Abstract

The  recognition  of  standard  computational  structures  (cliches)  in  a 
program  can  help  an  experienced  programmer  understand  the  program. 

Based on the known relationships between the cliches, a hierarchical
description  of  the  program’s  design  can  be  recovered.  We  develop  and 
study  a  graph  parsing  approach  to  automating  program  recognition  in 
which  programs  are  represented  as  attributed  dataflow  graphs  and  a 
library  of  cliches  is  encoded  as  an  attributed  graph  grammar.  Graph 
parsing  is  used  to  recognize  cliches  in  the  code. 

We demonstrate that this graph parsing approach is a feasible and
useful way to automate program recognition. In studying this approach,
we have experimented with two medium-sized, real-world simulator
programs. There are three aspects of our study. First, we evaluate
our representation’s ability to suppress many common forms of
program variation which hinder recognition. Second, we investigate
the expressiveness of our graph grammar formalism for capturing
programming cliches. Third, we empirically and analytically study the
computational cost of our recognition approach with respect to the
real-world simulator programs.

Chapter 1


Introduction 


Experienced engineers are able to quickly determine the behavior and properties of a complex
device by recognizing familiar, standard forms in its design. These standard forms,
which  we  call  cliches  [110,  112,  115,  137,  117],  are  combinations  of  primitive  mechanisms 
which engineers use frequently because the combinations have been found useful in practice.
From experience, the engineers have come to expect the cliched forms to exhibit certain
known behaviors. By relying on this “pre-compiled” knowledge, engineers are able to efficiently
understand and build complex devices containing cliched components without always
reasoning  from  first  principles.  Rich  [110,  112,  117]  has  developed  a  model  of  engineering 
problem  solving  in  which  synthesis  and  analysis  methods  are  based  on  the  recognition  and 
use  of  cliches.  He  calls  these  inspection  methods. 

This  report  deals  with  automating  the  recognition  of  cliches  in  computer  programs. 
Cliches  in  the  software  engineering  domain  are  stereotypical  algorithmic  computations  and 
data  structures.  Examples  of  algorithmic  cliches  are  list  enumeration,  binary  search,  and 
quick-sort.  Examples  of  data-structure  cliches  are  sorted  list,  priority  queue,  and  hash  table. 

Several  experiments  [58,  83,  128,  142]  give  empirical  data  supporting  the  psychological 
reality of cliches and their role in understanding programs. In trying to understand a program,
an experienced programmer may recognize parts of the program’s design by identifying
cliched computational structures in the code. Knowing how these structures implement
other  more  abstract  structures,  the  programmer  can  build  a  hierarchical  description  of  the 
program’s  design.  We  call  this  process  program  recognition.  Program  recognition  is  one 
technique,  among  several,  used  by  programmers  in  the  more  general  task  of  understanding 
programs. 


1.1  Motivations 

It  is  because  human  software  engineers  recognize  cliches  that  we  would  like  to  automate 
program  recognition.  This  gives  us  both  theoretical  and  practical  motivations. 

From a theoretical standpoint, automated program recognition is an interesting artificial
intelligence problem. It is an ideal task for studying how programming knowledge and
experience  can  be  represented  and  used.  (However,  in  automating  program  recognition,  the 
goal  is  not  to  mimic  the  cognitive  process  used  by  programmers  to  recognize  cliches,  but 
to  mimic  only  the  use  of  experiential  knowledge  in  the  form  of  cliches  to  achieve  a  similar 
result  of  understanding  the  program.) 

Our  practical  motivation  stems  from  an  interest  in  building  automated  systems  that 
assist  software  engineers  with  tasks  requiring  program  understanding,  such  as  inspecting, 
maintaining,  and  reusing  software.  Such  collaboration  requires  that  the  automated  assistant 
be  able  to  communicate  with  engineers  in  the  same  way  as  they  communicate  with  each 
other  when  performing  these  tasks.  They  refer  to  instances  of  cliches  and  assume  knowledge 
of  their  well-known  properties  and  behaviors.  For  example,  they  might  discuss  changing  a 
program from using an ordered associative linked list to using a hash table to gain efficiency.
They  discuss  the  change  at  a  high  level  of  abstraction  and  justify  their  design  decisions 
using  the  established  properties  of  the  cliches.  They  are  also  able  to  explain  the  design  of 
a  program  to  each  other  on  multiple  levels  of  abstraction.  They  can  convince  each  other  of 
the  properties  or  behavior  of  a  program  by  pointing  out  the  existence  of  cliches  in  its  design 
and  then  leveraging  off  the  accumulated  body  of  experience  surrounding  the  cliches.  The 
known  properties  of  the  cliches  are  used  directly,  rather  than  constructing  formal  proofs  or 
performing  formal  complexity  analyses  to  establish  that  the  properties  hold. 

If  an  automated  assistant  is  to  collaborate  with  human  engineers  in  the  same  way,  it 
must  share  the  same  knowledge  of  cliches  and  their  properties.  It  must  be  able  to  recognize 
instances  of  cliches,  without  requiring  the  human  engineer  to  explicitly  identify  and  locate 
them  in  a  program. 

This  recognition  ability  would  be  a  valuable  component  of  automated  software  tools 
and  assistants  that  perform  tasks  requiring  program  understanding.  They  would  be  able  to 
explain their understanding of the program in terms familiar to a human engineer. They could
respond  to  requests  from  the  engineer  that  are  phrased  in  terms  of  abstract  computational 
structures  in  the  program,  rather  than  low-level  commands  that  spell  out  actions  to  be 
performed on language primitives. (For example, Waters’ KBEmacs [116, 117, 139] shows
how an automated assistant can aid a human engineer while communicating at a high level
of abstraction. In KBEmacs, this model is constructed as the program is being built. A tool
like  KBEmacs  can  be  used  to  maintain  existing  code  (not  written  with  the  help  of  KBEmacs), 
if  the  cliches  from  which  the  code  is  constructed  are  recognized.) 

Incorporating  an  automated  recognition  system  into  software  tools  and  assistants  yields 
more  than  just  communications  benefits  for  human-computer  interaction.  By  mimicking  the 
human engineer’s “short-cut” to understanding a program’s design, an automated recognition
system provides an efficient way to reconstruct design information. It bypasses complex
reasoning  about  how  behaviors  and  properties  arise  from  a  certain  combination  of  language 
primitives.  The  behaviors  and  properties  can  be  used  directly  by  these  tools. 

Collaboration between a person and an automated recognition system is mutually beneficial.
An automated recognition system provides capabilities which complement the person’s
abilities. An automated system has significantly better memory capabilities than a
person.  These  are  valuable  in  maintaining  multiple  possible  views  of  the  program  and  in 
keeping  track  of  details  about  what  has  been  found  so  far.  Also,  some  cliches  may  be  easier 
for  the  computer  to  recognize  because  they  are  hidden  or  delocalized  in  the  textual  code 
representation,  but  are  localized  in  the  computer’s  internal  representation. 

On  the  other  hand,  people  have  some  capabilities  that  can  greatly  aid  the  recognition 
system.  They  may  have  access  to  many  different  sources  of  knowledge  about  the  program, 
beyond  the  source  code,  including  its  goals  or  specification,  documentation,  comments, 
execution  traces,  a  model  of  the  problem  domain,  and  typical  properties  of  the  program’s 
inputs  and  outputs.  Even  though  some  of  this  information  can  be  incomplete  and  inaccurate, 
it  provides  an  important  independent  source  of  expectations  about  a  program’s  purpose 
and  design.  These  expectations  can  be  used  to  guide  the  recognition  system  by  focusing  its 
search  on  particular  parts  of  a  program  for  particular  cliches. 

The  person  can  also  provide  information  not  easily  recoverable  from  the  code  which  can 
help  the  recognition  system  to  recognize  more  of  the  program.  For  example,  the  person 
can  undo  an  optimization  that  takes  advantage  of  an  opportune  dataflow  equality.  This 
may  uncover  a  dataflow  dependency  that  must  exist  for  a  particular  cliche  to  be  recognized. 
(More  concrete  instances  of  the  type  of  information  that  can  help  push  the  recognition  of 
some  cliches  through  are  described  in  Section  5.2.) 

Automated  tools  are  also  being  developed  to  aid  the  human  engineer  in  extracting 
design  information  and  generating  expectations  from  many  different  sources  in  addition  to 
the  code.  An  exemplary  system  is  DESIRE,  which  is  being  developed  by  Biggerstaff  [12,  13]. 
A  central  part  of  DESIRE  is  a  rich  domain  model,  which  contains  machine-processable 
forms  of  design  expectations  for  a  particular  domain  as  well  as  informal  semantic  concepts. 
It  includes  typical  module  breakdowns  and  typical  terminology  associated  with  programs 
in  a  particular  problem  domain.  Techniques  for  recognizing  patterns  of  organization  and 
linguistic  idioms  in  the  program  are  being  developed  to  generate  expectations  of  the  typical 
concepts  associated  with  these  patterns.  These  expectations  can  be  used  to  quickly  draw 
attention to sections of the program where there may be cliches related to a particular
concept  in  the  domain. 

Other, more conventional techniques for reverse engineering large programs have focused
on extracting a given system’s module structure. This is typically done by using clustering
[62]  and  slicing  [59,  140,  141]  techniques,  which  bring  together  parts  of  a  program  based  on 
identifier and procedure names, data dependencies, and call relationships, among other features
[13, 19, 46, 51, 56, 123, 124, 143]. Programming and maintenance environments, such
as  MicroScope  [7],  Cleveland’s  system  [20],  and  Marvel  [66],  provide  tools  for  performing 
various  types  of  dependency,  dynamic,  and  impact  analyses  and  for  browsing  the  results  in 
the form of call graphs, dataflow graphs, execution histories, and program slices.

These techniques and environments can contribute to a user’s understanding of a program.
While they alone do not provide a deep understanding, they extract information that
can  help  a  person  generate  advice  and  expectations.  Based  on  these,  the  person  can  guide 
an  automated  recognition  system,  so  that  a  deeper  understanding  may  be  obtained.  The 
results  of  recognition  can  in  turn  enhance  the  capabilities  of  these  automated  techniques 
by  providing  a  more  abstract  view  of  a  program.  For  example,  dependencies  between  more 
abstract  data  objects  can  be  computed  and  used  to  create  more  abstract  clusters. 


1.2  Toward  a  Hybrid  Program  Understanding  System 

Because program understanding requires many different techniques besides program recognition,
and draws upon various sources of knowledge besides the code, program understanding
systems of the future will be hybrid systems. They will integrate many different
special-purpose components for extracting design information from a program and its associated
documentation, domain model, etc. The components will communicate with human
engineers,  who  can  provide  additional  guidance  and  information. 

The  benefits  of  such  co-operation  between  specialists  in  solving  complex  problems  that 
require several, diverse types of knowledge are well known. For example, research in blackboard
architectures [37, 63, 99] and hybrid knowledge representation systems [113] studies
ways  of  achieving  co-operative  problem  solving. 

Figure  1-1  shows  a  model  of  a  hybrid  program  understanding  system.  It  is  roughly 
divided  into  two  complementary  processes:  expectation-driven  (top-down)  and  code-driven 
(bottom-up). The heuristic top-down process uses knowledge such as the program’s goals,
domain  model,  and  documentation  to  generate  expectations  about  the  program’s  design. 
These  can  be  used  to  guide  the  code-driven  process,  which  can  confirm,  amend,  or  reject 
them  by  checking  them  against  the  code. 

Since  there  are  many  different  types  of  things  an  engineer  or  application  tool  might 
wish  to  understand  about  a  program,  the  program  understanding  system  can  be  directed 
by  specific  questions  from  the  engineer  or  application. 

The  details  of  this  hybrid  system  have  not  yet  been  fleshed  out.  We  believe  that  a 
key  part  of  the  code-driven  component  is  an  automated  recognition  system.  The  labels  on 
the  communication  links  between  the  expectation-driven  and  code-driven  components  are 
useful  inputs  and  outputs  to  a  code-driven  system  based  on  recognition.  However,  these  do 
not  entirely  specify  the  communication  between,  or  the  nature  of,  these  components.  Also, 
the  diagram  is  not  meant  to  imply  that  all  the  techniques  integrated  into  the  hybrid  system 
are  either  solely  code-driven  or  expectation-driven.  Some  may  themselves  be  hybrids. 

Some of the questions that must be answered in the design of such a hybrid system
are: what techniques should be incorporated, and what is the appropriate division of labor
between  them?  There  are  also  managerial  problems  in  the  co-ordination  of  techniques  and 
the  integration  of  different  types  of  knowledge  and  representations  [93]. 

Figure 1-1: A hybrid program understanding system.

Determining which techniques to incorporate and what their individual responsibilities
are requires analyzing the candidate techniques to determine their relative strengths, limitations,
and computational expense. Our research takes a step toward the long-term goal
of  a  hybrid  program  understanding  system  by  exploring  the  strengths  and  weaknesses  of  a 
particular  program  recognition  technique. 

In  particular,  we  develop  and  study  a  graph  parsing  approach  to  program  recognition. 
This  approach  represents  the  program  in  a  dataflow  graph  representation  and  the  cliche 
library  in  a  graph  grammar  and  then  uses  graph  parsing  to  recognize  cliches  in  the  code. 
The  grammar  rules  capture  implementation  relationships  between  the  cliches.  The  parsing 
technique  yields  a  hierarchical  description  of  a  plausible  design  of  the  program  in  the  form 
of  derivation  trees  specifying  the  cliches  found  and  their  relationships  to  each  other. 

We  demonstrate  that  the  flow  graph  parsing  approach  is  a  feasible  and  useful  way  to 
automate  program  recognition.  We  also  identify  its  shortcomings.  This  information  will 
help  us  to  make  the  appropriate  division  of  labor  between  the  integrated  components  of  the 
hybrid  program  understanding  system. 

To  do  this,  we  developed  an  experimental  system  that  performs  recognition  on  realistic, 
medium-sized programs. Given a program and a library of cliches, it finds all occurrences of
the  cliches  in  the  program  and  builds  a  hierarchical  description  of  the  program  in  terms  of 
the  cliches  found.  (In  general,  there  may  be  several  such  descriptions.)  We  call  our  system 
GRASPR,  which  stands  for  “GRAph-based  System  for  Program  Recognition.” 


1.3  What  is  Involved  in  Automating  Program  Recognition? 

To  automatically  recognize  interesting  cliches  in  real-world  programs,  a  number  of  issues 
must  be  addressed.  This  section  discusses  the  key  issues. 

What  are  the  cliches?  We  must  identify  the  cliches  that  programmers  use.  These 
include  both  general  programming  cliches  that  most  programmers  use  (e.g.,  those  found  in 
textbooks on programming [3, 21, 76]) and domain-specific cliches that are used to solve
particular  problems.  For  the  results  of  recognition  to  be  useful,  we  also  need  to  collect 
the information that is associated with each cliche, such as its behavior, pre- and postconditions,
complexity, and common design rationale for choosing it. In general, cliche
library  acquisition  requires  domain  modeling,  which  is  itself  an  entire  area  of  active  research 
[106]. 

How are cliches and programs encoded? Once cliches are identified, they must be expressed
in a machine-manipulable form which makes relationships between the cliches explicit.
To facilitate recognition, the representation of cliches and programs should suppress
details  that  obscure  the  similarity  between  two  instances  of  the  same  cliche.  A  negative 
example  is  a  textual  representation  of  cliches  and  programs.  The  program  text  contains 
details  about  how  data  and  control  flow  is  achieved  in  terms  of  programming  language 
constructs.  This  introduces  syntactic  variation  across  programs  that  achieve  the  same  data 
and control flow but use different constructs or different programming languages. Other
types of variation besides syntactic include variations in the implementations of some abstract
cliche, the organization of components, the amount of redundant computation, and
the  contiguousness  (or  localization)  of  cliches.  These  are  described  further  in  Sections  2.3.1, 
5.1,  and  5.2.  The  representation  should  remove  as  much  variation  as  possible  between  two 
instances  of  the  same  cliche. 

How are cliches recognized efficiently? The recognition technique must deal with variation,
allow partial recognition of a program, and have a flexible control strategy. To deal
with the variation that the chosen representation cannot eliminate, the recognition technique
might view the program in multiple ways and at several levels of abstraction, or
introduce  transformations  to  reveal  the  similarities  between  programs  and  cliches. 

In  addition  to  dealing  with  variation,  the  recognition  technique  should  allow  partial 
recognition of the program, since programs are rarely constructed entirely of cliches. Unfamiliar
parts of the program must not deter recognition of the familiar parts.

Finally,  the  recognition  technique  should  have  a  flexible  control  strategy,  particularly  if 
it  is  expected  to  interact  with  other  components  in  a  hybrid  system.  There  may  be  a  range 
of  possible  inputs  to  the  recognition  system  as  well  as  a  variety  of  types  of  outputs  desired 
from  it.  The  types  of  inputs  to  the  recognition  system  that  tend  to  vary  are  the  advice  given 
to  guide  the  search  for  cliches  and  the  expectations  and  hypotheses  generated  from  external 
knowledge  sources.  These  vary  depending  on  the  amount  of  information  that  already  exists 
about  the  program  and  its  development  (e.g.,  in  its  associated  documentation).  The  input 
also  changes  as  the  recognition  system  and  expectation-driven  components  interact.  The 
task  to  which  recognition  is  being  applied  also  affects  the  type  of  information  available 
as  input.  For  example,  in  debugging,  verification,  or  program  tutoring  applications,  a 
specification of the program is often available from which strong guidance can be generated,
while  this  information  is  often  lacking  in  maintaining  old  code. 

The  application  task  can  also  place  restrictions  on  the  time  and  space  allotted  to  the 
recognition  system.  For  example,  a  real-time  response  may  be  required  of  the  system  if  a 
person  is  using  it  interactively  as  an  assistant  in  maintaining  code.  In  this  situation,  it  may 
be more desirable to quickly recognize cliches that are more “obvious” rather than spending
more  time  to  uncover  cliches  that  are  more  hidden  (e.g.,  by  an  optimization  which  must  be 
undone  for  them  to  be  revealed).  It  should  be  possible  to  prioritize  the  search  for  certain 
cliches,  so  that  obvious  ones  are  recognized  early,  while  still  reserving  a  “try  harder”  phase 
in  which  the  more  hidden  cliches  can  be  found.  This  allows  us  to  gain  efficiency  without 
permanently  sacrificing  completeness. 
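
One way to picture this prioritized control strategy is as an agenda ordered by estimated effort. The sketch below is only an illustration under assumptions of our own: the numeric costs, the budgets, and the function name `run_phase` are hypothetical and are not taken from GRASPR. Cheap ("obvious") candidates are processed first, and whatever does not fit in the interactive budget stays on the agenda for a later "try harder" phase instead of being discarded.

```python
import heapq

def run_phase(agenda, budget):
    """Pop the cheapest pending candidates until the effort budget is spent.

    agenda is a heap of (estimated cost, candidate cliche) pairs; anything
    that does not fit in the budget is left on the agenda for a later phase.
    """
    done = []
    while agenda and agenda[0][0] <= budget:
        cost, candidate = heapq.heappop(agenda)
        budget -= cost
        done.append(candidate)
    return done

# Hypothetical candidates with made-up effort estimates.
agenda = [
    (1, "list enumeration"),
    (8, "hash table (optimization must be undone first)"),
    (2, "binary search"),
]
heapq.heapify(agenda)

quick = run_phase(agenda, budget=3)     # interactive phase: obvious cliches only
later = run_phase(agenda, budget=100)   # "try harder" phase: the deferred work
print(quick)
print(later)
```

Because deferred candidates remain on the agenda, the first phase gains speed without giving up the option of a complete search afterwards.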

Not  only  is  it  important  that  the  recognition  system  be  responsive  to  directions  and 
additional  information  besides  the  code,  it  must  have  a  control  strategy  that  is  flexible 
enough  to  perform  a  variety  of  recognition  tasks.  There  are  many  reasons  a  human  engineer 
or  some  application  tool  may  want  recognition  to  be  performed,  since  they  typically  want 
to  understand  many  different  things  about  a  program.  The  recognition  task  depends  on 
what needs to be understood. For example, if the recognition system is going to be applied
to verification, it can use a strategy that finds any complete recognition of the program.
On  the  other  hand,  if  it  were  applied  to  documentation  generation,  it  would  be  better  for 
it to produce all possible full, as well as partial, analyses. For applications in which near-misses
of cliches should be recognized, such as debugging, the best partial analysis might
be  desired.  A  flexible  control  strategy  is  needed  that  can  be  tailored  to  a  variety  of  different 
recognition  tasks. 

To summarize, the main issues in automating recognition are: acquiring the cliche library,
choosing a representation and efficient technique that tolerates variation, and providing
a flexible control strategy. This report deals primarily with the problems of tolerating
variation  and  providing  a  flexible,  efficient  recognition  technique.  It  deals  secondarily  with 
the  cliche  acquisition  problem  by  discussing  experiences  in  manually  acquiring  our  cliche 
library.  It  does  not  discuss  aids  for  acquisition. 


1.4  Graph  Parsing  Approach 

There  are  two  key  aspects  of  our  approach. 

1.  Representation  shift:  Instead  of  looking  for  cliches  directly  in  the  source  code,  GRASPR 
translates the program and cliches into a language-independent, graphical representation.
The cliches and the relationships between them are encoded in graph grammar
rules. 

2.  Flexible  recognition  architecture:  Recognition  is  achieved  by  parsing  the  program’s 
graphical  representation  in  accordance  with  the  graph  grammar  encoding  of  the 
cliches.  A  chart  parsing  algorithm  is  used  which  makes  search  and  control  strategies 
explicit,  enabling  them  to  accept  advice  and  additional  information  from  external 
agents. 

Figure  1-2  shows  GRASPR’s  architecture.  In  keeping  with  the  bottom-up  nature  of  the 
recognition  process,  the  figure  shows  the  program  and  cliche  library  inputs  at  the  bottom 
and  the  more  abstract  results  of  recognition  at  the  top.  The  recognition  process  is  to  be 
read  upward.  This  also  makes  it  easier  to  see  how  GRASPR  fits  into  the  hybrid  system  shown 
in  Figure  1-1. 

GRASPR  translates  the  program  into  a  flow  graph,  which  is  a  restricted  type  of  directed 
acyclic  graph  (as  is  described  in  Section  3).  Basically,  the  graph  represents  operations  in  its 
nodes  and  dataflow  dependencies  between  them  in  its  edges.  It  is  annotated  with  attributes 
which  represent  additional  information  about  the  program,  for  example,  its  control  flow. 
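
To make the idea of operations as nodes and dataflow dependencies as edges concrete, here is a toy sketch. It is not GRASPR's translator: it uses Python's standard `ast` module on Python source purely for convenience, handles only straight-line assignments with binary operators, and omits attributes entirely. The function name `build_flow_graph` and the `"input:"` labels are our own inventions.

```python
import ast

def build_flow_graph(source):
    """Translate straight-line assignments into (nodes, edges).

    nodes maps a node id to its operation type; edges holds
    (producer, consumer id, input port) dataflow triples.
    """
    nodes, edges, env = {}, set(), {}

    def producer(expr):
        if isinstance(expr, ast.Name):
            # Variables are not nodes: look up whatever produced the value.
            return env.get(expr.id, "input:" + expr.id)
        if isinstance(expr, ast.BinOp):
            left, right = producer(expr.left), producer(expr.right)
            ident = len(nodes)
            nodes[ident] = type(expr.op).__name__   # e.g. "Mult", "Add"
            edges.add((left, ident, 0))
            edges.add((right, ident, 1))
            return ident
        raise ValueError("only names and binary operators are handled")

    for stmt in ast.parse(source).body:
        if not (isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name)):
            raise ValueError("only simple assignments are handled")
        env[stmt.targets[0].id] = producer(stmt.value)
    return nodes, edges

with_temporary = build_flow_graph("t = a * b\nr = t + c")
inlined = build_flow_graph("r = a * b + c")
print(with_temporary == inlined)
```

The two inputs differ textually (one names an intermediate result, the other nests the expression), yet they produce the same nodes and edges, because variable names disappear and only the dataflow remains. Comparing the graphs with `==` works here only because the node numbering happens to coincide; in general the comparison would need a structural match.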

Figure 1-2: GRASPR’s architecture.

A program is translated into an attributed flow graph in two steps. The first step performs
a data and control flow analysis of the program to yield a Plan Calculus representation
of it. The Plan Calculus is a program representation developed by Rich, Shrobe, and Waters
[110, 111, 112, 117, 127, 137] in which a program is captured in an annotated directed
graph, called a plan. The structure of this graph explicitly captures both data and control
flow, as well as aggregate data structure accessors and constructors, and recursion. The
second step of the translation encodes the plan in an attributed flow graph representation.

The  Plan  Calculus  is  used  as  a  stepping  stone  in  the  translation  of  the  program  to 
an  attributed  flow  graph.  The  main  reason  the  program  is  not  translated  directly  to  the 
flow  graph  is  that  the  attributes  are  easier  to  compute  from  the  plan  than  to  generate  in 
one  shot  during  the  data  and  control  flow  analysis.  A  secondary  reason  is  that  GRASPR 
is  intended  as  one  component  of  an  intelligent  software  engineering  assistant,  called  the 
Programmer’s  Apprentice  (PA)  [117].  By  being  able  to  encode  plans  in  its  internal  flow 
graph  representation,  GRASPR  can  more  easily  interface  to  other  components  of  the  PA, 
which  all  share  the  Plan  Calculus  representation. 

The  Plan  Calculus  is  also  a  representation  that  has  been  found  useful  in  representing  the 
cliche  library.  It  allows  relationships  between  cliches  to  be  captured  in  the  form  of  overlays. 
These  represent  the  knowledge  that  an  instance  of  one  cliche  can  be  viewed  as  an  instance 
of another (e.g., a specification cliche and an implementation cliche).

Cliches are translated from a Plan Calculus representation to an attributed flow graph
grammar by a process similar to the encoding of plans in attributed flow graphs. The grammar
rules encode the relationships specified in overlays. Each rule also places constraints
on  the  attributes  of  any  flow  graph  structurally  matching  the  rule’s  right-hand  side.  These 
constraints  explicitly  encode  the  variations  that  are  allowed  in  the  values  of  attributes  in 
cliche  instances. 

Once  the  program  and  cliche  library  are  encoded  in  an  attributed  flow  graph  and  flow 
graph  grammar,  recognition  is  achieved  by  parsing  the  flow  graph  in  accordance  with  the 
grammar. Constraint checking is interleaved with parsing for efficiency (as described in
Sections  3.2.3  and  6.2.2).  Essentially,  graph  parsing  matches  the  dataflow  structure  of  cliches 
and  constraint  checking  deals  with  the  other  details  of  cliches  that  cannot  be  represented 
in  the  graph  structure  or  are  sources  of  too  much  variation  if  graphically  represented. 
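
The division of labor between structural matching and constraint checking can be sketched as follows. This is a simplified illustration, not GRASPR's chart parser: it matches the right-hand side of a single rule by backtracking, and the rule name `average`, the attribute name `ce` (standing in for a control-flow attribute), and all class and function names are hypothetical. The point it shows is that a constraint is tested as soon as the nodes it mentions are bound, so a failing candidate is pruned before the match is completed.

```python
from dataclasses import dataclass, field

@dataclass
class FlowGraph:
    types: dict                                  # node id -> operation type
    edges: set                                   # (producer id, consumer id)
    attrs: dict = field(default_factory=dict)    # node id -> {attribute: value}

@dataclass
class Rule:
    lhs: str            # name of the cliche the rule recognizes
    types: dict         # pattern node -> required operation type
    edges: set          # dataflow edges between pattern nodes
    constraints: list   # [(pattern nodes needed, predicate over their attrs)]

def matches(rule, graph):
    """Yield each binding of the rule's pattern nodes to graph nodes."""
    order = list(rule.types)

    def consistent(binding):
        # Structural check: every pattern edge with both ends bound must exist.
        for a, b in rule.edges:
            if a in binding and b in binding and (binding[a], binding[b]) not in graph.edges:
                return False
        # Constraint check, interleaved: run once the needed nodes are bound.
        for needed, predicate in rule.constraints:
            if all(n in binding for n in needed):
                if not predicate(*(graph.attrs.get(binding[n], {}) for n in needed)):
                    return False
        return True

    def extend(binding, i):
        if i == len(order):
            yield dict(binding)
            return
        p = order[i]
        for h, t in graph.types.items():
            if t == rule.types[p] and h not in binding.values():
                binding[p] = h
                if consistent(binding):
                    yield from extend(binding, i + 1)
                del binding[p]

    yield from extend({}, 0)

# Hypothetical rule: a sum feeding a division, both in the same control context.
average = Rule(
    lhs="average",
    types={"s": "Add", "d": "Div"},
    edges={("s", "d")},
    constraints=[(("s", "d"), lambda s, d: s.get("ce") == d.get("ce"))],
)
graph = FlowGraph(
    types={1: "Add", 2: "Div", 3: "Add", 4: "Div"},
    edges={(1, 2), (3, 4)},
    attrs={1: {"ce": "top"}, 2: {"ce": "top"}, 3: {"ce": "then"}, 4: {"ce": "top"}},
)
found = list(matches(average, graph))
print(found)
```

Nodes 3 and 4 have the right dataflow structure but are rejected by the attribute constraint, which is the kind of detail the text says is better kept out of the graph structure itself.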

Parsing  yields  hierarchical  descriptions  of  the  program’s  design  in  the  form  of  the  possible 
derivations  of  the  program’s  flow  graph  from  the  flow  graph  grammar  that  was  extracted 
from the cliche library. These are called design trees.

By  shifting  the  representation  of  programs  and  cliches  from  text  to  a  flow  graph,  GRASPR 
is  able  to  overcome  many  of  the  difficulties  of  syntactic  variation  and  noncontiguousness. 
It  abstracts  away  the  syntactic  features  of  the  code,  exposing  the  program’s  algorithmic 
structure.  It  concisely  captures  the  data  and  control  flow  of  programs,  independent  of  the 
language in which they are written. Also, many cliches that are delocalized in the program
text  are  much  more  localized  in  the  flow  graph  representation. 

The graph grammar captures relationships between cliches so that the results of recognition
can be given on multiple levels of abstraction. Grammar rules relate abstract cliches
to their implementations. This enables GRASPR to deal with implementation variation: two
implementation cliches can be recognized as the same abstract cliche. The grammar also
captures commonalities between cliches so that large numbers of cliches can be encoded
more  compactly. 
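
The following sketch illustrates how rules that share a left-hand side let two different implementations be viewed as the same abstract cliche, and how the result forms a tree. It is deliberately much weaker than flow graph parsing: it treats a program as a bag of already-recognized labels and ignores dataflow edges and attributes altogether. The rule set, the cliche names, and the names `DesignTree` and `parse` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class DesignTree:
    cliche: str
    children: list = field(default_factory=list)

# Hypothetical rules: (abstract cliche, parts of one implementation of it).
RULES = [
    ("associative-lookup", ("hash-computation", "bucket-search")),
    ("associative-lookup", ("ordered-list-search",)),
]

def parse(primitives, rules):
    """Reduce bottom-up until no rule applies; return the remaining roots.

    Assumes the rules contain no cycles, otherwise the loop would not end.
    """
    roots = [DesignTree(p) for p in primitives]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            picked = []
            for name in rhs:
                hit = next((t for t in roots
                            if t.cliche == name and all(t is not p for p in picked)), None)
                if hit is None:
                    break
                picked.append(hit)
            else:
                # Every part was found: replace the parts by the abstract cliche.
                roots = [t for t in roots if all(t is not p for p in picked)]
                roots.append(DesignTree(lhs, picked))
                changed = True
    return roots

first = parse(["hash-computation", "bucket-search", "unfamiliar-code"], RULES)
second = parse(["ordered-list-search"], RULES)
print([t.cliche for t in first])
print([t.cliche for t in second])
```

Both inputs end up with an `associative-lookup` root whose children record which implementation was found, and the unfamiliar part of the first input is simply left as its own root rather than blocking recognition of the rest.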

In  using  a  graph  parsing  approach,  we  are  not  trying  to  mimic  the  recognition  process 
of  human  programmers.  No  claim  is  being  made  that  formal  parsing  is  a  psychologically 
valid  model  of  how  programmers  understand  existing  programs.  For  the  present  work,  a 
grammar  is  simply  a  useful  way  to  encode  the  programmer’s  experiential  knowledge  about 
programming  so  that  parsing  can  be  used  for  program  recognition. 

1.5  Goals  and  Contributions 

The  goal  of  this  research  is  to  show  that  graph  parsing  is  a  good  computational  model 
for  automating  program  recognition,  and  to  identify  its  capabilities  and  limitations.  We 
demonstrate  the  following: 

•  We  can  encode  many  interesting  programming  cliches  and  the  relationships  between 
them  in  a  flow  graph  grammar. 

•  The  flow  graph  formalism  provides  an  effective  representation  for  tolerating  many 
classes  of  variation. 

•  Flow graph parsing can be used to recognize the cliches. The derivation trees that
result  provide  a  useful  hierarchical  description  of  the  program,  over  multiple  levels  of 
abstraction. 

•  Limitations in the power of the recognition system to recognize certain cliches can be
alleviated  by  accepting  additional  design  information  from  an  external  agent  (such  as 
a  person),  who  is  interacting  with  it. 

•  Recognition by flow graph parsing can be performed efficiently in real-world situations.

•  The  complexity  of  the  recognition  process  can  be  controlled  if  the  parser’s  control 
strategy is sufficiently flexible and responsive to advice from an external agent.

We show these things by experimenting with real-world program examples, which are
medium-sized (in the 500 to 1000 line range) simulation programs written in Common Lisp
by members of a parallel-processing research group at MIT. (Section 2.2 describes them
further.) We are able to express both general programming cliches and cliches from the
simulation domain in a flow graph grammar. GRASPR recognizes these cliches in the example
programs efficiently.

Our experimentation also reveals shortcomings in our graph parsing approach. Many
of the limitations can be compensated for by other techniques and by using other sources
of knowledge which may be available in the context of a hybrid program understanding
system.

The specific contributions of this research are the following. (This list includes brief
statements of how these contributions advance the state-of-the-art of recognition research.
More details on related research are given in Section 7.3.)

•  We  develop  and  use  a  flow  graph  grammar  formalism  in  which  programs  and  cliches 
can  be  concisely  represented  so  that  much  variation  is  eliminated  and  relationships 
between  cliches  are  encoded. 

This graph-based representation has significant advantages over the text-based
representations used by many other recognition systems, particularly in dealing with
syntactic variation.

•  We  present  a  recognition  architecture  with  a  general,  flexible  control  structure  that  can 
accept  advice  and  guidance  from  external  agents.  The  trade-off  between  recognition 
power and computational expense can be explicitly controlled so that some cliches are
recognized quickly, while other more expensive recognitions are postponed to a
“try-harder” phase. The algorithm exhaustively finds all possible recognitions of cliches and
can  generate  multiple  views  of  a  program  as  well  as  partial  “near-miss”  recognitions. 
This  architecture  forms  a  seed  for  a  hybrid  program  understanding  system. 

Other  recognition  systems  are  committed  to  a  rigid  (often  ad  hoc)  control  strategy. 
Most  search  for  a  single  best  interpretation  of  the  program,  while  permanently  cutting 
off  alternatives.  They  often  build  heuristics  into  the  system  for  controlling  cost  that 
are  chosen  on  a  trial-and-error  basis.  They  cannot  try  harder  later  to  incrementally 
increase their power. They also cannot generate multiple views of the program when
desired,  nor  provide  partial  information  when  only  near-misses  of  cliches  are  present. 

Some  recognition  techniques  can  use  information  obtained  from  one  or  two  other 
techniques  (e.g.,  theorem  proving  or  dynamic  analysis  of  program  executions)  with 
which  they  are  integrated.  Many  recognition  techniques  also  take  information  about 
the  goals  and  purpose  of  the  program  (in  the  form  of  a  specification  or  model  program). 
While  these  techniques  show  the  utility  of  these  additional  sources  of  information,  they 
rely  on  this  information  being  given  as  input,  rather  than  accepting  it  and  responding 
to it if it becomes available.

•  We  analyze  the  graph  parsing  approach  to  program  recognition  to  determine  how  it 
would  fit  into  the  context  of  a  hybrid  program  understanding  system. 

We  address  the  questions: 

-  What types of variations is the technique robust under? What types of variations
are a problem? What other techniques must be used to remove the variation?

-  Are graph grammars expressive enough to encode programming cliches?

-  Is the technique feasible for large programs? How can the cost be controlled?

The observations we make in this analysis are based on our experiences in applying
GRASPR to the recognition of two example programs. They do not represent complete
lists of the capabilities and limitations of the graph parsing approach. Further
experimentation is needed with more programs and in multiple problem domains.

Much of the early work in program recognition provides no analysis of the representations
or techniques used. More recent research includes some empirical analysis,
typically studying the accuracy of recognition and the recognition rates over sets of
programs (usually student programs in program tutoring applications). With the
exception of Hartman’s work [55], discussions of limitations have focused mainly on
practical implementational limitations, rather than on general limitations of the approach.
They also do not describe how additional information or guidance can help.

•  Our recognition system is able to recognize programs and cliches containing a wide
range of types of program features. In particular, it is able to represent and recognize
programs that contain conditionals, loops with any number of exits, recursion, aggregate
data structures, and simple side effects due to assignments. (Suggestions for
future work in dealing with side effects to mutable data structures are given in
Section 7.2.4.) This allows GRASPR to recognize larger programs than existing recognition
systems. It also enables encoding and recognition of domain-specific cliches as well as
general-purpose ones, since many domain-specific cliches are aggregate data structure
cliches. This allows empirical study of our recognition technique on programs that
are not contrived nor biased toward our work.

With  the  exception  of  CPU  [84],  existing  recognition  systems  cannot  handle  aggregate 
data  structure  cliches  and  a  majority  do  not  handle  recursion.  Talus  [95]  heuristically 
handles  some  side  effects  to  lists  and  arrays.  The  largest  program  recognized  by  any 
existing recognition system is a 300-line database program recognized by CPU. All
other  systems  work  with  programs  on  the  order  of  tens  of  lines.  None  deal  with 
domain-specific  cliches,  except  Laubsch’s  system  [81,  82]. 

•  A  secondary  contribution  is  a  graph  parsing  algorithm  which  is  an  extension  of  the 
parsers  of  Lutz  [90]  and  Brotsky  [15]  to  handle  a  wider  class  of  graph  grammars.  In 
particular, it is able to parse graph grammars that encode aggregation, which hierarchically
groups graph edges, not just nodes. This algorithm has potential applications
in  areas  other  than  program  recognition,  e.g.,  circuit  verification  and  plan  recognition. 
Section  7.2  discusses  some  applications. 

•  We  do  not  contribute  automated  aids  to  the  acquisition  of  the  cliche  library.  However, 
we  do  discuss  our  experiences  in  manually  acquiring  the  cliches. 

This  type  of  discussion  has  not  appeared  in  any  other  work  on  program  recognition 
of which we are aware.

1.6 Outline of Report


Chapter 2 describes the cliche library and our experiences in acquiring it. It also demonstrates
GRASPR’s recognition of these cliches in the example simulation programs. Chapter 3
describes the flow graph formalism which forms the basis of our representation shift. It also
presents a flow graph chart parsing algorithm, which provides a flexible recognition control
strategy. It includes a summary of related work in the general area of graph grammar
formalisms. Chapter 4 gives details of issues that arise in applying flow graph parsing to
program recognition and how GRASPR solves them. Chapter 5 discusses the capabilities and
limitations of the parsing approach in terms of the variations tolerated, and the expressiveness
of flow graph grammars. Chapter 6 studies the computational cost of our approach,
both empirically and analytically. Finally, Chapter 7 concludes with a summary of the
strengths and weaknesses of the parsing approach, ideas for future work (particularly in the
context of a hybrid system), and a brief comparative summary of related work in program
recognition.

Chapter 2

The  Knowledge,  Program  Corpus, 
and  Recognition  Examples 


An  important  part  of  automating  program  recognition  is  codifying  the  knowledge  that 
experienced  programmers  use  to  recognize  programs.  This  knowledge  is  in  the  form  of 
algorithmic  and  data  structure  cliches.  It  includes  both  general-purpose  cliches  that  occur 
in  programs  over  all  problem  domains,  as  well  as  those  specific  to  a  particular  domain. 

Our  library  must  capture  and  express  these  cliches  at  a  level  of  abstraction  that  allows 
them  to  be  recognized  in  a  broad  range  of  programs.  The  ideal  is  that  the  cliches  be  concisely 
represented,  but  efficiently  recognized  in  many  forms.  Recognition  of  a  cliche  should  be 
immune  to  many  common  syntactic  and  implementational  variations.  For  example,  the 
same  cliches  should  be  recognized  in  programs  that  differ  only  in  which  syntactic  binding 
and  control  constructs  they  use  or  in  which  programming  languages  they  are  written.  Also, 
an  abstract  cliched  operation  that  exists  in  two  programs  should  be  recognized  in  both, 
even  if  the  programs  differ  in  which  standard  implementation  of  the  operation  is  used. 

This  chapter  discusses  the  cliches  we  have  captured  so  far  in  our  library.  It  also  describes 
the  corpus  of  programs  we  chose  on  which  to  base  both  our  cliche  acquisition  and  our 
empirical study of recognition. Finally, it gives examples of the capabilities of GRASPR in
recognizing  these  cliches  not  only  in  our  example  corpus,  but  also  in  a  range  of  variations 
of  them.  (Chapter  3  discusses  the  formalism  we  use  to  abstractly  and  concisely  capture 
our  cliches  to  make  this  possible.)  Our  examples  provide  both  a  demonstration  of  what  is 
feasible  as  well  as  motivation  for  our  formalism  and  recognition  technique. 

2.1  What  are  the  Cliches? 

Our cliche library contains a core set of general-purpose, “utility” cliches, along with a set
of cliches from the domain of sequential simulation. The domain-specific cliches are built on
top of the core utility cliches (i.e., they use utility cliches as components or implementations).

The general-purpose cliches are well-known, widely used algorithms and data structures,
such as those described in introductory computer science textbooks (e.g., [3, 21, 76]). They
are found in programs across all problem domains. They include common operations on
priority queues, hash tables, lists, and first-in-first-out (FIFO) queues, as well as basic
iteration cliches, such as sequence enumeration, filtering, accumulation, and counting.

The domain-specific cliches in our library are found in programs that sequentially simulate
parallel systems. More specifically, we have encoded the subset of common algorithms
and data structures found in this domain that are used to sequentially simulate
message-passing parallel systems.

A  message-passing  system  contains  a  collection  of  processing  nodes  which  communicate 
with each other via messages. Each processing node contains a processor, a network
interface, and a block of distributed memory. The message-passing system takes a program
in  the  form  of  a  set  of  message  handlers  and  a  starting  message.  The  program  begins  by 
sending  the  starting  message  to  its  destination  node.  The  node  executes  the  handler  for 
that  message’s  type.  In  addition  to  changing  the  state  of  the  node,  this  can  cause  the  node 
to  send  messages  to  other  nodes  (e.g.,  to  request  the  value  of  some  variable  or  to  delegate 
some  sub-tasks).  When  these  messages  are  handled  by  their  destination  nodes,  additional 
messages  might  be  sent. 

It  is  possible  for  a  message  to  be  received  by  a  node  while  it  is  handling  another  message. 
Therefore,  each  node  has  a  local  buffer  which  accumulates  the  messages  received  while  the 
node  is  busy.  When  the  node  finishes  handling  a  message,  if  its  buffer  is  non-empty,  the 
node pulls a message from the buffer and handles it. The buffer is emptied in FIFO order.
This  is  done  to  maintain  the  invariant  that  two  messages  received  by  the  same  node  must 
be  handled  in  the  order  in  which  they  are  received. 

The behavior just described is simulated by the programs in which our library’s
domain-specific cliches are found. This is a subset of the actual behavior of a real message-passing
system, which also includes routing messages through the network, for example. However,
this simplified model is a typical one simulated in parallel architecture research. The
simulation allows statistics to be gathered on such properties as the number of nodes busy over
time (a measure of concurrency), average message execution times, and average message
waiting times.

2.1.1  Simulation  Domain  Context 

It is instructive to see how the domain we have chosen fits into the larger world of simulation
programs. It is a subset of the problem domain of sequential simulation, as opposed to
parallel simulation, of parallel systems. Our cliche library contains only sequential algorithmic
cliches.

Within the domain of sequential simulation, there are two types of simulators:
discrete-event and continuous. Discrete-event simulators model the behavior of a system over discrete
points in time. Continuous simulators model behavior that is characterized by state that
changes continuously. (Continuous simulators typically solve a set of differential equations
that express how the system’s state changes over time. Continuous simulation is used, for
example, to study heat dissipation in computer systems.) Our simulation cliches are found
in discrete-event simulators. The discrete points in time at which a message-passing system
can be modeled are when a message is sent, received, or handled.

Within the domain of discrete-event sequential simulation, our class of simulator programs
is most similar to simulators that model queueing systems [91]. In a queueing
system, there is a collection of one or more servers which service tokens (sometimes called
“customers”). There is a notion of arrival time and processing time of tokens; tokens get
buffered in a queue if they arrive while a server is busy. The queueing discipline is typically
first-in, first-out, but it can be a different one if tokens need not be serviced in the order in
which they arrive. A common real-world situation captured by the queueing system model
is the servicing of bank customers by one or more tellers, where the customers wait in a
single line.

The queueing system model (using a FIFO queueing discipline) is similar to the message-passing
multi-processor model. Servers are analogous to processing nodes and servicing a
token is analogous to handling a message. However, there are two key differences. One
is that in the queueing system, servicing a token does not create new tokens which feed
back to the servers. In the message-passing machine model, handling a message can cause
new messages to be sent. The other key difference is that in the queueing system model,
the waiting tokens are not targeted for a particular server to service. Whichever server is
idle when a token is removed from the queue is the one that gets the job. In the message-passing
model, on the other hand, each message is sent to a particular node for handling.
The message’s destination is determined when the message is sent. Our class of simulator
programs can be seen as modeling a multi-queue multi-server system with feedback (in
which tokens are targeted for particular servers and servers have local FIFO queues for
buffering tokens when the server is busy).

2.1.2  Informal  Cliche  Acquisition  Strategy 

In acquiring our domain-specific cliches, we used an informal strategy. (Developing a domain
modeling methodology for cliche acquisition is beyond the scope of this research.) We
worked in two directions. One was bottom up by manually understanding two program
examples in our domain. (These are described in Section 2.2.) This allowed us to identify
concrete computational structures that were used in the simulators’ designs. The differences
between the two programs in implementing the same high level operation helped us to generalize
our cliches. The similarities between the programs pointed out common components
that some cliches shared. We were fortunate in that the authors of the programs were accessible
for answering our questions about the design of the programs. Their explanations
helped us not only to understand the programs, but also to identify the cliches, since the
authors often referred to algorithms and data structures that they considered to be typical.

Our  second  direction  was  top-down.  We  read  textbooks  in  the  area  of  simulation,  such 
as [91, 151], to pick up the vocabulary and descriptions of typical high-level computational
structures  that  are  used.  We  then  mapped  these  down  to  portions  of  the  example  programs 
that  embody  them. 

In  identifying  the  cliches  to  be  captured,  we  tried  to  identify  the  most  general  form  of 
each  cliche  and  then  express  it  in  a  way  that  canonicalized  specializations  of  it.  (This  was 
done  both  by  using  an  abstract  representation  and  by  providing  mechanisms  for  viewing 
specializations  as  the  more  general  form.)  However,  sometimes  this  canonicalization  was 
not  possible  and  we  needed  to  include  specializations  of  the  cliche  in  the  library  along  with 
the  generalized  forms.  In  these  cases,  we  relied  on  empirical  frequency  of  occurrence  of  the 
specialized  forms,  to  avoid  enumerating  all  possible  variations  (which  can  be  expensive  and 
incomplete). 

This issue came up most frequently in trying to capture cliched operations on aggregate
data structures. We encountered three distinguished types of parts of aggregate data
structures:

•  Primary  -  a  part  that  holds  a  piece  of  data  directly.  (For  example,  a  Hash  Table  data 
structure  contains  a  Buckets  part  which  is  usually  an  array). 

•  Handle  -  a  part  that  is  used  to  look  up  a  primary  part.  (For  example,  a  data  structure 
might  contain  a  primary  part  Node  that  represents  a  processing  node  or  it  might 
contain  an  integer  (an  identification  number)  that  is  used  to  index  into  another  data 
structure  to  retrieve  the  structure  representing  a  node.) 

•  Secondary  -  a  piece  of  data  that  is  an  unnecessary  part  of  a  data  structure  in  that  it 
can  be  computed  from  a  primary  part  or  a  handle  part  of  the  data  structure.  These  are 
usually  cached  values.  (For  example,  a  Circular-Indexed  Sequence  includes  a  sequence 
part, and two indices which keep track of the bounds on the filled-in portion of the
sequence. It can have an additional secondary part which keeps a running count of
the number of elements in the Circular-Indexed Sequence. This part is unnecessary
because  it  can  be  computed  from  the  size  of  the  sequence  and  the  boundary  indices.) 

If  we  were  to  capture  all  aggregate  data  cliches  in  their  general  form  -  as  aggregates 
of  only  primary  parts  -  we  would  have  trouble  recognizing  them  in  cases  where  handles 
are  used  and  in  cases  where  secondary  (cached)  parts  are  used  to  circumvent  computation 
performed  on  primary  parts.  So,  we  capture  these  specialized  forms,  but  only  if  they  are 
common.  That  is,  we  capture  data  cliches  that  are  common  optimizations  and  common 
uses  of  handles. 
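
To make the distinction between primary and secondary parts concrete, the following is a
minimal Python sketch of a circular-indexed sequence. It is an illustration only: the class
and attribute names are our assumptions, not code from the cliche library or from the example
programs (which are written in Common Lisp). The sketch also assumes that one slot of the
sequence is left unused, so that the count can be recomputed from the boundary indices alone.

```python
class CircularIndexedSequence:
    """Fixed-size ring buffer used as a FIFO queue (illustrative names)."""

    def __init__(self, size):
        self.seq = [None] * size   # primary part: the sequence itself
        self.first = 0             # primary part: index of the oldest element
        self.last = 0              # primary part: index of the next free slot
        self.count = 0             # secondary part: cached number of elements

    def computed_count(self):
        # Recompute the count from the primary parts alone. One slot is kept
        # unused so that first == last always means "empty".
        return (self.last - self.first) % len(self.seq)

    def enqueue(self, item):
        if self.computed_count() == len(self.seq) - 1:
            raise OverflowError("sequence is full")
        self.seq[self.last] = item
        self.last = (self.last + 1) % len(self.seq)
        self.count += 1            # keep the cached value in step

    def dequeue(self):
        if self.first == self.last:
            raise IndexError("sequence is empty")
        item = self.seq[self.first]
        self.seq[self.first] = None
        self.first = (self.first + 1) % len(self.seq)
        self.count -= 1            # keep the cached value in step
        return item


# Usage: the cached count always agrees with the recomputed one.
q = CircularIndexedSequence(4)
for x in "abc":
    q.enqueue(x)
oldest = q.dequeue()               # "a"
q.enqueue("d")                     # the write index wraps around to slot 0
agree = q.count == q.computed_count()
```

A program that reads `count` directly and one that calls `computed_count()` implement the
same abstract operation, which is the kind of variation the specialized forms in the library
are meant to cover.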

Sometimes an optimization of some generalized cliche is possible in the particular context
in which it is used, but this optimization is not a common one. Perhaps it takes advantage
of a rare alignment with other cliches or of opportune dataflow equalities. Since it is not
common, it is not in the cliche library. (Likewise for handles.) Unless we can undo the
optimization or use of a handle, the recognition of the cliche will be hindered. Section 5.1.5
describes a class of common optimizations which can be undone. Sections 5.2.2 and 5.2.1
discuss some optimizations and uses of handles that should be able to be undone, but which
require advice from an external agent.

2.1.3  Sequential  Simulation  Cliches 

There  are  two  common  designs  for  sequential  simulators  of  parallel  systems.  One  is  a 
synchronous simulation, which mimics the real system by maintaining a global clock and
simulating  the  actions  of  the  nodes  in  “lock-step.”  On  each  tick  of  the  clock,  the  simulator 
“advances”  each  node  by  simulating  what  the  node  would  do  in  the  real  system  on  that 
clock  tick.  In  this  type  of  simulation,  all  simulated  nodes  are  synchronized  to  the  global 
clock.  At  each  clock  tick,  the  state  of  the  simulated  nodes  gives  a  snapshot  of  the  state  of 
the  system  at  the  time  represented  by  the  clock  tick. 

The  other  common  sequential  simulator  design  is  event-driven.  In  this  type  of  simulator, 
there  is  an  agenda  of  events,  which  represent  work  to  be  done  by  the  nodes.  The  simulator 
iteratively pulls an event from the agenda and performs the work associated with it. This
may  cause  new  events  to  be  generated,  which  are  added  to  the  agenda.  The  simulation  ends 
when  the  agenda  is  empty.  Unlike  in  synchronous  simulation,  the  actions  of  the  nodes  are 
simulated  asynchronously  rather  than  all  being  in  step  with  a  global  clock.  The  nodes  each 
keep  track  of  their  own  local  time,  which  is  updated  when  they  process  an  event. 
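
The event-driven loop just described can be sketched as follows. This is a hedged
illustration in Python rather than code from the example simulators: the function names, the
tuple layout of an event, and the `handle` callback interface are all assumptions. In
particular, ordering the agenda by time with a priority queue is our choice for the sketch;
the text above only requires that events be pulled until the agenda is empty.

```python
import heapq
from itertools import count


def run_event_driven(start_events, handle, num_nodes):
    """Process events until the agenda is empty; return each node's local time.

    start_events: iterable of (time, node, payload) tuples.
    handle(node, payload): hypothetical callback returning (cost, new_events),
        where new_events is a list of (delay, dest_node, payload) tuples.
    """
    local_time = [0] * num_nodes
    tie = count()                        # unique tie-breaker for heap entries
    agenda = [(t, next(tie), n, p) for (t, n, p) in start_events]
    heapq.heapify(agenda)
    while agenda:                        # the simulation ends on an empty agenda
        time, _, node, payload = heapq.heappop(agenda)
        start = max(time, local_time[node])    # the node may still be busy
        cost, new_events = handle(node, payload)
        local_time[node] = start + cost        # update the node's own clock
        for delay, dest, new_payload in new_events:
            entry = (local_time[node] + delay, next(tie), dest, new_payload)
            heapq.heappush(agenda, entry)      # new work goes on the agenda
    return local_time


def relay(node, hops):
    # Each event costs 2 time units; forward it to the next node until
    # the hop counter reaches zero.
    new_events = [(1, (node + 1) % 3, hops - 1)] if hops > 0 else []
    return 2, new_events


final_times = run_event_driven([(0, 0, 2)], relay, num_nodes=3)
print(final_times)   # prints [2, 5, 8]
```

Note that there is no global clock in the loop: each node's time advances only when that
node processes an event.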

Our cliche library contains algorithmic and data structure cliches that make up the
designs of event-driven and synchronous simulators for message-passing systems. The next
two sections discuss these designs and the cliches from which they are constructed.

A  Common  Synchronous  Simulation  Design 

A common design used in synchronous simulators of message-passing systems has data
structures representing processing nodes and messages. (In this discussion, we denote the
data structure representing a node as SYNCH-NODE to distinguish it from the real processing
node. Similarly, MESSAGE denotes the data structure representing a real message.) Each
SYNCH-NODE contains a Local-Buffer part, whose value is a FIFO queue of messages, and a
Memory part which represents the state of the node being represented. Each MESSAGE data
structure contains a Destination-Address which specifies the node to which the message it
represents was sent. It also typically contains a message Type, which is used to look up a
handler for the message, Arguments which are used in executing the handler, and
Storage-Requirements which specify how much local memory space is needed to store arguments and
locals during handler execution.

All SYNCH-NODEs are collected in a sequence, called an ADDRESS-MAP, which maps an integer
address to a SYNCH-NODE. The SYNCH-NODE indexed by an integer t is the one representing the
real node whose address is t in the machine being simulated. A global buffer of MESSAGES is
also maintained to help model message delivery delay, as is explained below.

A common algorithm used for synchronous simulation proceeds as follows. The simulation
is begun by adding a “start” MESSAGE, which is given as input, to the global MESSAGE
buffer. On each iteration of the simulation, the following actions are taken.

•  A termination condition is checked and if satisfied, the simulation stops. This condition
is that the global MESSAGE buffer and all the Local-Buffers of the SYNCH-NODEs are
empty.

•  The MESSAGES in the global buffer are “delivered,” which means each is placed in the
Local-Buffer of the SYNCH-NODE to which it was sent (i.e., the SYNCH-NODE in the
ADDRESS-MAP indexed by the MESSAGE’s Destination-Address part).

•  Each SYNCH-NODE is polled to see if it has any work to do, i.e., if it has any MESSAGES in
its Local-Buffer. If so, a MESSAGE is pulled from the buffer (maintaining FIFO order)
and handled. If any new MESSAGES are sent as a result, they are buffered in the global
MESSAGE buffer.

The  global  MESSAGE  buffer  is  used  to  ensure  that  delivery  delay  is  modeled.  Buffering  the 
MESSAGES  sent  during  a  clock  cycle  prevents  a  message  from  being  sent  and  handled  during 
the  same  cycle. 

The invariant that messages to the same node are handled in the order in which they are
received is modeled by using a FIFO queue to locally buffer the MESSAGES that a SYNCH-NODE
must handle. A MESSAGE will not be handled by a SYNCH-NODE until all the MESSAGES enqueued
on the FIFO queue ahead of it have been handled.
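
The three per-iteration actions, together with the global and local buffers, can be sketched
in Python as follows. This is a hedged illustration, not the code of either example
simulator: the class names mirror the data structure names used above, but the field names,
the `simulate` function, and the `handle` callback interface are assumptions made for the
sketch.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    destination_address: int   # index into the address map
    msg_type: str               # used to look up a handler
    arguments: tuple = ()


@dataclass
class SynchNode:
    local_buffer: deque = field(default_factory=deque)   # FIFO queue of Messages
    memory: dict = field(default_factory=dict)           # state of the node


def simulate(address_map, start_message, handle):
    """Run the lock-step loop; return the number of clock ticks simulated.

    handle(node, message) is a hypothetical callback that updates node.memory
    and returns a list of newly sent Messages.
    """
    global_buffer = [start_message]
    ticks = 0
    while True:
        # 1. Terminate when the global buffer and every local buffer are empty.
        if not global_buffer and not any(n.local_buffer for n in address_map):
            return ticks
        # 2. Deliver each buffered Message to its destination node.
        for msg in global_buffer:
            address_map[msg.destination_address].local_buffer.append(msg)
        global_buffer = []
        # 3. Poll each node; handle at most one Message per node per tick.
        for node in address_map:
            if node.local_buffer:
                msg = node.local_buffer.popleft()         # FIFO order
                global_buffer.extend(handle(node, msg))   # sends wait one tick
        ticks += 1


def ping_handler(node, msg):
    # Record the message, then bounce it to the other node until the
    # counter carried in the arguments reaches zero.
    node.memory.setdefault("seen", []).append(msg.msg_type)
    remaining = msg.arguments[0]
    if remaining == 0:
        return []
    return [Message(1 - msg.destination_address, "ping", (remaining - 1,))]


nodes = [SynchNode(), SynchNode()]
ticks = simulate(nodes, Message(0, "ping", (2,)), ping_handler)
```

Because newly sent messages are appended to `global_buffer` and only delivered at the start
of the next iteration, a message can never be sent and handled within the same tick.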

What  it  means  for  a  MESSAGE  to  be  “handled”  (or  what  action  of  a  processing  node 
is simulated) by the simulator varies across simulators. It depends on why a simulation
is  being  performed  and  which  aspects  of  a  message-passing  system  are  of  interest.  For 
example,  some  simulators  might  want  to  simulate  the  message  handler  execution  on  the 
node  in  order  to  gather  statistics  about  operation  frequencies  or  average  message  execution 
time on each node. Other simulators might only want to simulate message sends that result
from handler execution, in order to gather information about average message waiting times,
typical  size  of  buffers  needed,  and  the  number  of  nodes  busy.  In  addition,  the  set  of  message 
handling  actions  that  are  simulated  varies  over  the  machines  that  are  being  simulated.  The 
machine  architecture  of  a  real  node  determines  which  actions  it  performs;  only  these  can 
be  simulated. 

We have begun to identify and capture some cliches in the area of simulating node
actions. These include algorithms for looking up and executing message handlers as well
as cliches found in the domain of program execution. Below we discuss the cliches we have
captured so far and Section 5.2 describes the difficulties we encountered in acquiring them.

Although we have identified some cliches in this area, it is unlikely that the code for
simulating  the  actions  of  nodes  will  always  be  a  cliche.  There  is  a  wide  variety  of  reasons  to 
simulate  a  message-passing  system,  resulting  in  a  wide  range  of  node  behaviors  to  mimic. 
This  variation  is  reflected  in  the  diverse  code  responsible  for  simulating  a  node’s  actions. 
So,  we  also  look  at  the  issues  involved  when  an  integral  part  of  an  algorithmic  cliche  for 
synchronous  or  event-driven  simulation  may  be  filled  with  unfamiliar,  non-cliched  code.  It 
is  difficult  to  encode  such  a  cliche  in  a  flow  graph  grammar  so  that  it  can  be  recognized  by 
graph  parsing.  This  is  discussed  in  Sections  4.1.4  and  5.2.3. 

There  are  many  variations  of  the  algorithm  described  in  this  section  that  still  achieve 
synchronous  simulation.  For  example,  on  each  iteration,  our  algorithm  performs  three 
actions  in  the  following  order:  test  for  termination,  deliver  messages,  and  poll  and  advance 
nodes  by  one  step.  The  other  variations  of  this  algorithm  in  which  a  different  ordering  is 
used  also  perform  synchronous  simulation.  However,  the  current  cliche  library  contains  only 
the  one  given  above  as  an  algorithmic  cliche.  Section  5.2  discusses  the  problems  we  face  in 
trying  to  concisely  encode  and  recognize  the  other  variations. 

The algorithm and data structures used in this synchronous simulation design are captured
in our cliche library as cliches. However, the cliches are not flat structures, but are
hierarchically built out of other cliches. The hierarchical organization allows sharing of
common sub-computations among cliches, which helps us avoid redoing work during recognition.
This also highlights the salient characteristics between two similar cliches which is
useful in controlling recognition cost and choosing between near-miss recognitions of the
cliches. (However, no static organization can do this perfectly, since saliency is relative.)

Figure 2-1 shows the names of the algorithmic cliches upon which the Synchronous-Simulation
algorithmic cliche is built. Lines connecting the names indicate relationships
between the named cliches. (This is only a portion of the cliche library. Figure 2-3 shows
additional algorithmic cliches used in a common event-driven simulation design which is
described in the next section. Also, the fringes of the trees in Figures 2-1 and 2-3 contain
the names of general-purpose cliches and small triangles to indicate that the sub-tree of
cliche names upon which they are built is not shown. Refer to Figure 2-5 for these cliche
names and how they relate to the other general-purpose cliches in the library.) Figure 2-2
shows the aggregate data cliches in our library and how they relate to each other.

The  trees  of  cliche  names  are  shown  only  to  give  a  flavor  of  the  structure  of  the  cliche 
library.  More  description  of  the  cliches  and  details  of  how  they  are  encoded  are  given  in 
Section  4.1. 

There are three types of relationships between the cliches in the library. One type of
relationship is composition: Cliches may contain other cliches as parts. (This relation is
shown in the trees of Figures 2-1 and 2-2 as a set of branching lines, grouped by a circular
arc. The root name represents a cliche that is composed of the cliches named by the
branches.)

For example, the aggregate data structure SYNCH-NODE consists of two parts, a Buffer and
a Memory, each of which is another cliche: a Queue and an Associative Set, respectively.
A similar relationship can occur between algorithmic cliches. The algorithmic cliche of
Synchronous Simulation using a Global Message Buffer is composed of three other cliches:
Queue-Insert, Generate-Global-Buffers-and-Nodes, and Earliest-Simulation-Finished.

Figure 2-1: Synchronous simulation cliches.

The second type of relationship that can occur between two cliches is an implementation
relationship: A cliche may implement a more abstract cliche. For example, a FIFO,
Stack, or Priority Queue can implement a Queue. Poll-Nodes-and-Do-Work is an implementation
of Advance-Nodes. (Lines between cliche names in Figures 2-1 and 2-2 that are
not grouped or starred represent this relationship. Of two cliches connected by a line, the
upper one is implemented by the lower. Branching ungrouped lines represent alternative
implementations of the root.)

The third type of relationship occurs when one cliche is a temporal abstraction of another.
Temporal abstraction is a technique developed by Waters [117, 137, 138] and further
extended by Rich and Shrobe [110, 127], in which a cliched fragment of iterative computation
is viewed more abstractly as an operation on a sequence of values - the sequence of
values that are processed over time, one per iteration. For example, Sum is a temporally
abstract operation that takes a sequence of numerical values and produces their total. This
is a temporal abstraction of a loop fragment in which each iteration computes the sum of
a new value and the result of the sum computed on the previous iteration. The temporal
abstraction of this fragment views the sequence of new values accumulated in the sum as
the input to Sum. (Lines marked with an asterisk in Figure 2-1 indicate that the upper
cliche name is an operation that temporally abstracts the lower iterative cliche.) In Figure
2-1, Generate-Global-Buffers-and-Nodes is an example of a temporally abstract operation.
It takes the initial global MESSAGE buffer and the initial collection of SYNCH-NODEs and creates
a sequence of new global MESSAGE buffers and SYNCH-NODE collections. (This is a temporally
abstract view of the iterative computation performed on each iteration of the simulation in
which MESSAGEs are delivered and SYNCH-NODEs are stepped.)
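The Sum example can be made concrete with a small sketch. It is written in Python for illustration only; the function names `sum_loop`, `running_totals`, and `temporal_sum` are hypothetical and do not come from the cliche library. The first function is the iterative fragment, the second exposes the sequence of values produced over time (one per iteration), and the third is the temporally abstract view in which a whole sequence is the input to a single operation.

```python
from itertools import accumulate


def sum_loop(values):
    """Iterative fragment: each iteration adds a new value to the previous total."""
    total = 0
    for value in values:
        total = total + value
    return total


def running_totals(values):
    """The sequence of totals produced over time, one per iteration."""
    return list(accumulate(values))


def temporal_sum(sequence):
    """Temporally abstract view: one operation from a whole sequence to its total."""
    return sum(sequence)


new_values = [3, 1, 4, 1, 5]
loop_result = sum_loop(new_values)
abstract_result = temporal_sum(new_values)
```

Both views compute the same total; the abstract one simply hides the loop and treats the values seen on successive iterations as a single sequence-valued input.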

A  Common  Event-Driven  Simulation  Design 

This section describes a common event-driven simulator design for message-passing systems.
It has data structures ASYNCH-NODE and MESSAGE, representing processing nodes and messages,
respectively. It also has an EVENT data structure, which represents the arrival of a MESSAGE at
an ASYNCH-NODE. Each ASYNCH-NODE data structure maintains its own local Clock. It also has
a Memory part, holding its state. There is a sequence containing all ASYNCH-NODEs, called
an ADDRESS-MAP, which maps each integer address to an ASYNCH-NODE (as in the synchronous
simulation design). MESSAGEs typically have the same parts as those in the synchronous
simulation design (Destination-Address, Type, Arguments, Storage-Requirements). An EVENT
contains an Object, which is a MESSAGE to be handled, and a Time at which the work to be
done on the object (i.e., handling a message) was scheduled (i.e., when the MESSAGE arrives
at an ASYNCH-NODE).

A global agenda, called the EVENT-QUEUE, keeps track of EVENTs that need to be processed.
The agenda is implemented as a Priority Queue, in which the EVENT with the earliest Time
has the highest priority.

The event-driven simulator is given an initial EVENT, whose Object is a starting MESSAGE
and whose Time is the MESSAGE’s arrival time. This is added to the EVENT-QUEUE. On each step
of the simulation, the highest priority EVENT is pulled from the EVENT-QUEUE and processed.
Processing an EVENT means simulating the handling of the MESSAGE in the EVENT’s Object
part. The simulated message handling is done in the context of the ASYNCH-NODE that
represents the real node that is the destination of the message. This is looked up using
the Destination-Address part of MESSAGE as an index into the sequence ADDRESS-MAP. (As we
mentioned earlier, the portion of the simulator that simulates a processing node’s message
handling actions varies. Below, we describe an initial set of cliches that may be used.
However, this portion of the simulator is not guaranteed to always be cliched.)

When an EVENT is processed, the Clock of the destination ASYNCH-NODE for its MESSAGE
Object is updated: the ASYNCH-NODE’s Clock becomes the maximum of its current time
and the arrival time of the MESSAGE (i.e., EVENT’s Time). (The ASYNCH-NODE’s current time
can be later than the arrival time if the simulator is mimicking a real situation in which
the real node was busy when the message arrived. The arrival time can be later than an
ASYNCH-NODE’s current time if in the real situation being simulated, the real node is idle
when the message arrives.)

Handling a MESSAGE can cause other MESSAGEs to be sent. These are added to the
EVENT-QUEUE. The event-driven simulation ends when the EVENT-QUEUE is empty.

An important characteristic of this algorithm is that the MESSAGEs are handled
non-preemptively, which means that once an ASYNCH-NODE starts to handle a MESSAGE, it will not be
interrupted, e.g., by receiving another MESSAGE.

Another property of the algorithm is that at each step, the globally earliest unprocessed
MESSAGE received so far is chosen to be handled. Since the EVENT pulled from the EVENT-QUEUE
is always the one with the earliest Time, and since Time is the arrival time of the MESSAGE
in the EVENT’s Object part, the MESSAGE chosen to be handled next is always the one with
the earliest arrival time of the MESSAGEs that have not yet been handled.

These two properties ensure that once a MESSAGE is chosen for handling, no MESSAGEs
will subsequently be generated that have an arrival time earlier than the MESSAGE chosen.
In other words, MESSAGEs are handled in the order they arrive. So the simulator models the
invariant obeyed by the real machine: messages to the same node are handled in the order
in which they are received.
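The event-driven design described in this section can be sketched as follows, in Python rather than the Common Lisp of PiSim. The priority queue is a binary heap from the standard `heapq` module. Several details are assumptions made only for the example and are not stated in the text: the `handle` callback returns a handling cost and a list of outgoing messages, a node's Clock advances by that cost, and a sent message arrives at the moment its sender finishes handling. The names `AsynchNode`, `simulate_event_driven`, and `toy_handle` are hypothetical.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass
class AsynchNode:
    clock: int = 0
    memory: dict = field(default_factory=dict)


def simulate_event_driven(address_map, initial_event, handle):
    """Process events earliest-first until the event queue is empty."""
    tie_breaker = count()  # keeps heap entries comparable when times are equal
    event_queue = []
    time, message = initial_event
    heapq.heappush(event_queue, (time, next(tie_breaker), message))
    handled = []
    while event_queue:  # the simulation ends when the event queue is empty
        time, _, message = heapq.heappop(event_queue)  # earliest Time first
        node = address_map[message["dest"]]
        # Clock update from the text: maximum of current time and arrival time.
        node.clock = max(node.clock, time)
        cost, outgoing = handle(node, message)
        node.clock += cost  # assumed: handling takes `cost` time units
        handled.append((time, message["dest"], message["type"]))
        for new_message in outgoing:
            # Assumed: a sent message arrives when its sender finishes handling.
            heapq.heappush(event_queue, (node.clock, next(tie_breaker), new_message))
    return handled


def toy_handle(node, message):
    """Toy handler: a start message sends two work messages to node 1."""
    if message["type"] == "start":
        return 2, [{"dest": 1, "type": "work"}, {"dest": 1, "type": "work"}]
    return 3, []


nodes = [AsynchNode(), AsynchNode()]
trace = simulate_event_driven(nodes, (0, {"dest": 0, "type": "start"}), toy_handle)
```

In this run the second work message arrives at time 2 while node 1 is still busy until time 5, so the `max` leaves the node's Clock at 5 before the message is handled. This is the "node was busy when the message arrived" case described above.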

Figure 2-3 shows the structure of the portion of the cliche library that contains the
event-driven simulation cliche and the cliches it is built upon. (For data cliches, refer to
Figure 2-2.)

Node Action Simulation Cliches

The  two  simulators  for  message-passing  parallel  systems  contain  a  component  that  simulates 
some  or  all  of  the  actions  that  a  real  processing  node  takes  when  handling  a  message. 
Which  actions  are  simulated  depends  on  the  behavior  of  interest  for  the  simulation.  We 
have  begun  to  collect  some  cliches  that  occur  in  simulators  that  model  message  handler 
lookup  and  execution  on  a  node.  These  cliches  are  found  in  the  broader  domain  of  program 
execution  in  general,  and  the  domain  of  program  interpretation  (or  evaluation)  in  particular 
[1].  Figure  2-4  shows  the  structure  of  this  portion  of  the  library. 

The  cliches  we  have  collected  so  far  are  those  for  the  following. 

• Looking up a handler based on a MESSAGE’s Type, which is typically an Associative-Set-Lookup
or Property-List-Lookup, depending on how the handlers are stored.

• Loading the MESSAGE’s Arguments into the Memory part of an ASYNCH-NODE or SYNCH-NODE
(depending on whether the simulator is event-driven or synchronous). This involves
looking up the ASYNCH-NODE or SYNCH-NODE indexed by the MESSAGE’s Destination-Address,
enumerating the Arguments, accumulating them in a sequence, and adding
the sequence to the Memory part (typically an Associative Set).

• Executing the handler on the input data given in the Arguments. An EXECUTION-CONTEXT
data structure is used to keep track of the Node executing the handler (which
is an ASYNCH-NODE or SYNCH-NODE), the Status of the execution (a Symbol), Bindings
of variable names to Memory locations (in an Associative Set), and the Instructions
being executed (which is an Indexed Sequence, a data structure with two parts: a Base
sequence of INSTRUCTIONs and an integer Index which acts as an instruction pointer).
An INSTRUCTION consists of an Operator (symbol), and a set of Arguments (typically
in a list or an adjustable-length sequence), which may be other INSTRUCTIONs.

The handler execution involves iteratively fetching the next instruction to be executed
using the current value of the instruction pointer. A standard Lisp EVALUATE/APPLY
recursion is then used to interpret the INSTRUCTION with respect to the current values
of the variable names stored in Memory. The Operator part of the INSTRUCTION is used
to look up a Common Lisp function for simulating the actions of the processing node in
applying that operator type to arguments. The EVALUATE/APPLY recursion “evaluates”
an INSTRUCTION by iterating through its Arguments, recursively evaluating each one,
and then applying the function associated with the INSTRUCTION’s Operator to the
results.
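The evaluate/apply recursion described in the last item can be sketched as follows. The sketch is in Python for illustration, not the Common Lisp of the simulators, and it simplifies freely: the `Instruction` class, the `OPERATORS` table, the treatment of strings as variable names and integers as constants, and the use of a plain dictionary for the Bindings are all assumptions of the example.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Instruction:
    operator: str
    arguments: Tuple["Operand", ...] = ()


Operand = Union[Instruction, str, int]


def evaluate(operand, bindings, operators):
    """Evaluate an operand against variable bindings and an operator table."""
    if isinstance(operand, Instruction):
        # Recursively evaluate each argument, then apply the operator's function.
        results = [evaluate(arg, bindings, operators) for arg in operand.arguments]
        return apply_operator(operand.operator, results, operators)
    if isinstance(operand, str):
        return bindings[operand]  # variable name: look up its current value
    return operand  # literal constant


def apply_operator(name, results, operators):
    """Look up the function associated with an operator and apply it."""
    return operators[name](*results)


# Hypothetical operator table standing in for the looked-up simulation functions.
OPERATORS = {"times": lambda x, y: x * y, "minus": lambda x, y: x - y}

program = Instruction("times", ("x", Instruction("minus", ("n", 1))))
value = evaluate(program, {"x": 6, "n": 4}, OPERATORS)
```

Because the function applied at each step is fetched from a table keyed by the Operator, the computation actually performed is determined by the contents of that table rather than by the evaluator's own code.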

We have made a first attempt at capturing the knowledge needed to recognize program
execution cliches. Our experiences in encoding these cliches in the graph grammar helped
us to understand both the strengths and weaknesses of the formalism for expressing certain
types of programming ideas. This is discussed further in Chapter 5.

Figure 2-4: Node action simulation cliches.

2.1.4 The General-Purpose Cliches

Figure 2-5 gives an abstract picture of the relationships between the groups of general-purpose
cliches that are contained in the library. Each box represents a set of algorithmic
cliches that represent either operations on some aggregate data structure cliche
(e.g., Priority-Queue) or basic iteration or computational cliches (e.g., Sum,
Sequence-Enumeration, Absolute-Value). Each box contains the names of some of the cliches
contained in the group it represents.

The arrows between the boxes indicate that the cliches in the source group use the
cliches in the sink group as components, or the cliches in the source group are abstractions
of those in the sink group. For example, the arrow from FIFO to Circular-Indexed-Sequence
(CIS) indicates that cliched operations on FIFOs can be implemented as cliched operations
on CISs. The arrow from CIS to Basic-Iteration-Cliches indicates that the operations of
manipulating a CIS use basic iteration cliches as components (e.g., the operation of
enumerating a CIS uses a Bounded-Count operation as a component, which generates a sequence
of integers within some interval).
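The two kinds of arrow can be illustrated with a small sketch in Python. The text does not give the parts of a Circular-Indexed-Sequence, so the representation used here (a base tuple, the index of the oldest element, and a size) is an assumption, as are all of the function names. FIFO operations are implemented as non-destructive CIS operations, and CIS enumeration uses a bounded count as a component.

```python
from dataclasses import dataclass
from typing import Tuple


def bounded_count(low, high):
    """Generate the integers in the half-open interval [low, high)."""
    return range(low, high)


@dataclass(frozen=True)
class CircularIndexedSequence:
    base: Tuple
    first: int = 0  # index of the oldest element (assumed representation)
    size: int = 0   # number of live elements

    def enumerate_elements(self):
        """Enumerate live elements, using bounded_count as a component."""
        return [self.base[(self.first + i) % len(self.base)]
                for i in bounded_count(0, self.size)]


def fifo_enqueue(cis, item):
    """FIFO insert implemented as a non-destructive CIS operation."""
    if cis.size == len(cis.base):
        raise OverflowError("FIFO is full")
    slot = (cis.first + cis.size) % len(cis.base)
    base = cis.base[:slot] + (item,) + cis.base[slot + 1:]
    return CircularIndexedSequence(base, cis.first, cis.size + 1)


def fifo_dequeue(cis):
    """FIFO removal: return the oldest element and the remaining sequence."""
    if cis.size == 0:
        raise IndexError("FIFO is empty")
    item = cis.base[cis.first]
    rest = CircularIndexedSequence(
        cis.base, (cis.first + 1) % len(cis.base), cis.size - 1)
    return item, rest


q = CircularIndexedSequence((None,) * 3)
q = fifo_enqueue(q, "a")
q = fifo_enqueue(q, "b")
item, q = fifo_dequeue(q)
q = fifo_enqueue(q, "c")
q = fifo_enqueue(q, "d")  # wraps around to slot 0
contents = q.enumerate_elements()
```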

The  cliche  library  does  not  contain  all  existing  algorithmic  cliches  that  operate  on  the 
data  structures  mentioned  in  Figure  2-5.  We  captured  a  fair  number,  but  due  to  time 
limitations,  we  could  not  collect  a  complete  set. 


2.2  Real-World  Programs 

In studying program recognition, we focused on two programs which were written in Common
Lisp by researchers in a parallel architecture group at MIT. The programs sequentially
simulate the parallel execution of programs by a fine-grain message-passing parallel machine
(which is described in [26]).

One program, called PiSim, simulates the parallel execution of programs in terms of the
operations of a “parallel interface” (Pi) [146, 147]. (A parallel architecture interface
separates parallel programming model issues from machine hardware issues, in a way analogous
to the von Neumann interface for sequential computers. For more details, see [146].) It uses
the event-driven algorithm and the program interpretation cliches that are in our library.

The  other  simulator  simulates  the  parallel  execution  of  programs  written  in  a  language 
called  “Concurrent  SmallTalk”  [25].  We  will  refer  to  this  simulator  as  CST.  It  uses  the 
synchronous  simulation  design. 

The  CST  simulator  program  is  actually  a  module  in  a  larger  program  which  provides  a 
programming  environment  for  compiling,  simulating,  tracing,  and  gathering  and  displaying 
statistics  on  the  execution  of  Concurrent  SmallTalk  code.  Functions  that  call  the  simulator 
are  not  analyzed,  neither  are  the  metering,  tracing,  and  plotting  functions  that  it  calls. 

There are a few important points about the example simulators that are relevant to our
study of recognition. One is that currently, GRASPR is unable to recognize cliches in programs
that contain operations that destructively modify mutable data structures. Our plan is to
study the recognition of aggregate data structures, independent of issues concerning side
effects to them, and then attempt to tackle the problems of mutable data structures later. So,
we manually converted the example programs to programs that contain only non-destructive
versions of the data structure operations. For example, we replaced destructive alterations
to data structures with changes to copies of the data structures. We also propagated these
changes to the data structures that pointed to the altered data structure, and so on. We
essentially routed the dataflow by hand so that all aliasing was taken into account. (Section
7.2.4 gives more details. Appendix B contains the original versions of the two simulator
programs, followed by their functional translations.)
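The flavor of this translation can be shown with a minimal sketch, in Python rather than the Common Lisp of the simulators. The `Node` class and both functions are hypothetical; the actual translations are the ones given in Appendix B. The destructive version alters a node in place, so every holder of the node sees the change. The functional version alters a copy and then propagates the change to the structure that points to the node, returning a new address map.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    clock: int


def set_clock_destructive(nodes, address, time):
    """Original style: mutate the node in place (nodes is a list of dicts)."""
    nodes[address]["clock"] = time


def set_clock_functional(address_map, address, time):
    """Translated style: change a copy, then rebuild the container around it."""
    new_node = replace(address_map[address], clock=time)
    return address_map[:address] + (new_node,) + address_map[address + 1:]


# Destructive version: the caller's list is altered as a side effect.
mutable_nodes = [{"clock": 0}, {"clock": 0}]
set_clock_destructive(mutable_nodes, 1, 7)

# Functional version: the old map is untouched and the new map is returned.
old_map = (Node(0), Node(0))
new_map = set_clock_functional(old_map, 1, 7)
```

In the functional version the change is visible only through the returned value, so the flow of data from the update to its later uses is explicit, which is what routing the dataflow by hand amounts to.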

In  doing  the  translation,  we  found  that  many  of  the  translation  steps  are  automatable. 
For  certain  types  of  side  effects,  it  may  be  possible  to  automatically  uncover  straightforward 
types  of  aliasing  patterns  and  replace  them  with  their  non-destructive  counterparts.  The 
insights  we  gained  should  help  us  extend  GRASPR  in  the  future  to  deal  with  side  effects  to 
mutable  objects,  as  discussed  in  Section  7.2.4. 

All of the cliches in our current library are “pure” in that they include no destructive
operations (such as RPLACD, RPLACA, or SETF in Common Lisp).

Another important point concerns how the programs simulate message handling. We
mentioned earlier that we have only begun to encode the cliches found in code that is
responsible for simulating a processing node’s action of handling a message. We have
experimented with recognizing these cliches in PiSim, which contains them. However, we
would also like to explore the issues that arise when an integral part of an algorithmic
cliche can be filled with unfamiliar, perhaps loosely constrained code. The CST program
allows us to explore these difficulties because it contains code for simulating a node’s action
that is not cliched (at least with respect to our current library of cliches). Details of these
difficulties and suggestions for solving them are given in Sections 4.1.4 and 5.2.3.

Our final point is that even though PiSim contains cliched node action simulation code,
problems still arise in expressing and recognizing certain cliches. This is because part of
the information about how to simulate a node’s action is given as input, rather than being
statically contained in the program. In particular, PiSim takes a set of message handlers as
input. Each handler provides a set of instructions to be executed when handling a certain
type of message. For example, Figure 2-6 gives a handler for a Factorial message, which
iteratively computes the factorial of a single argument (N). (The X is a local variable.) The
instructions in the handlers are written in a language of Machine Operations (e.g., Times,
Branch-Zero). Each Machine Operation has a Common Lisp function associated with it
that specifies how to simulate the actions of the processing node in executing that machine
operation. They are defined in terms of simulator functions. For example, Figure 2-7 shows
the functions that are associated with the operations Times and Branch-Zero.

(define-handler Factorial (N) (X)
  (print-user "~%Running simple loop test~%")
  (write (self) X 1)
  Loop
  (branch-zero (read (self) N) Done)
  (write (self) X (times (read (self) X) (read (self) N)))
  (write (self) N (minus (read (self) N) 1))
  (branch-zero 0 Loop)
  Done
  (print-user "~%The answer is ~d~%" (read (self) X))
  (destroy-segment (self)))

Figure 2-6: A message handler for Factorial.

Like the set of handlers, the definitions of Machine Operations are inputs to PiSim. This
means they are not available for analysis or recognition. The problem that this poses is
that the data and control flow of the entire PiSim program cannot be statically computed.
It depends on the input for a particular simulation. The implication of this is that we do
not have complete knowledge about who calls the simulator functions or how their inputs
and outputs are connected. The problems we have encountered as a result are discussed in
Section 5.2.
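A small sketch may help show why the flow cannot be computed statically. It is written in Python, not the Common Lisp of PiSim, and it is a loose paraphrase rather than a rendering of PiSim's interpreter: the `run_handler` function, the tuple encoding of instructions, and the use of lambdas in place of nested instructions are all assumptions of the example. The point it illustrates is that both the handler and the operation table arrive as data, so which function runs at each step, and where control goes next, is known only at run time.

```python
def run_handler(instructions, operations, memory):
    """Execute a handler given as data; labels are bare strings in the list."""
    labels = {item: i for i, item in enumerate(instructions) if isinstance(item, str)}
    ip = 0  # instruction pointer into the instruction sequence
    while ip < len(instructions):
        item = instructions[ip]
        ip += 1
        if isinstance(item, str):
            continue  # a label marks a position; nothing to execute
        operator, *args = item
        # Which function runs here is known only once `operations` is supplied.
        target = operations[operator](memory, *args)
        if target is not None:
            ip = labels[target]  # control flow depends on the operation's result
    return memory


def op_write(memory, name, value_fn):
    """Store a computed value under a variable name; never branches."""
    memory[name] = value_fn(memory)
    return None


def op_branch_zero(memory, test_fn, label):
    """Return the label to jump to when the test value is zero."""
    return label if test_fn(memory) == 0 else None


# Both tables below play the role of inputs to the simulator.
OPERATIONS = {"write": op_write, "branch-zero": op_branch_zero}

FACTORIAL = [
    ("write", "X", lambda m: 1),
    "Loop",
    ("branch-zero", lambda m: m["N"], "Done"),
    ("write", "X", lambda m: m["X"] * m["N"]),
    ("write", "N", lambda m: m["N"] - 1),
    ("branch-zero", lambda m: 0, "Loop"),
    "Done",
]

result = run_handler(FACTORIAL, OPERATIONS, {"N": 5})
```

Nothing in `run_handler` itself mentions multiplication or looping; an analysis of the interpreter alone sees only a table lookup and a call, which mirrors the incomplete knowledge described above.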

Choice  of  Programs:  Breaking  Out  of  the  Toy  Program  Rut 

In  choosing  programs  to  use  in  our  study  of  recognition,  our  goal  was  to  break  out  of  the  rut 
of  automating  the  recognition  of  “toy”  programs,  in  which  most  earlier  recognition  research 
has  been  caught.  Both  simulator  programs  (PiSim  and  CST)  do  this.  Their  sizes  fall  in  the 
500  to  1000  line  range,  rather  than  being  on  the  order  of  tens  of  lines,  which  is  the  typical 
size  of  programs  dealt  with  in  previous  recognition  research. 

Program  length  is  only  an  approximate  indicator  of  the  potential  difficulty  of  recognizing 
a  program.  In  addition  to  choosing  larger  programs,  we  have  chosen  programs  not  written 
by  us  (the  designers  of  the  recognition  system).  The  simulator  programs  are  not  contrived 
examples.  They  were  written,  without  bias,  to  solve  a  particular  real-world  problem. 

A key advantage of this is that it provides challenges to the recognition approach that
might not be anticipated by us, as developers of it. Even though we may need to change or
simplify the original program to allow recognition to occur, we are aware of the limitation of
our approach that requires this. We also are aware of the type of transformation that should
be made or the advice that should be given to help deal with the shortcoming. (Section
5.2 discusses the limitations observed and Section 5.2.5 summarizes changes made to the
original programs to yield the programs that GRASPR recognizes.)

Additionally, the programs indicate which characteristics of programs are typical. This
helps us in analyzing our recognition technique. For example, recognition by graph parsing
can be expensive if there are excessive amounts of redundant computation, which causes
ambiguity. However, this characteristic is rare in the example simulator programs. Knowing
which characteristics are typical or rare in real-world programs helps us determine which
factors influence the practicality of our approach.

(Define-Operation Times (Active-Task X Y)
  (multiple-value-bind (New-Time Task-Node New-Task)
      (Increment-Time-Of Active-Task 1)
    (values (* X Y) New-Task)))

(Define-Operation Branch-Zero (Active-Task Test-Variable Label)
  (multiple-value-bind (New-Time Task-Node New-Task)
      (Increment-Time-Of Active-Task 1)
    (if (zerop Test-Variable)
        (values Label
                (Make-Task :Handler (Task-Handler New-Task)
                           :Node (Task-Node New-Task)
                           :Segment (Task-Segment New-Task)
                           :IP Label
                           :Status (Task-Status New-Task)))
        (values nil New-Task))))

Figure 2-7: The definition of two Machine Operations.

Another aspect of the simulator programs which distinguishes them from the “toy” programs
studied previously is that they contain domain-specific cliches. These go beyond
general-purpose cliches, such as operations on queues, stacks, and hash tables, which have
been the focus of previous recognition research. The programs contain common simulation
algorithms and data structures. By recognizing these cliches, GRASPR provides more useful
program understanding capabilities than if it recognized the general-purpose cliches alone.
This allows us to explore the expressiveness of the graph grammar formalism as a
representation for domain-specific cliches. (On the other hand, the current cliche library has
been acquired with the example programs in mind. More empirical studies are needed to
evaluate the ability of the existing system to recognize new programs with the same library
and to determine how much the library must change to recognize them.)

The simulator programs also contain a fair amount of unfamiliar code mixed in with
cliched computational structures. In experimenting with them, we test GRASPR’s abilities
to perform partial recognition, which is required in dealing with any realistic, non-trivial
program.

2.3 Recognition Examples

Besides identifying the knowledge needed to understand and construct programs, it is
important to capture this knowledge in such a way that it can be applied to a broad range of
programs. In automating program recognition, our goal is to codify programming cliches
at a level of abstraction that allows us to recognize them in programs that vary widely in
such details as syntactic constructs used, programming language chosen, data structure and
subroutine decomposition, and implementational choices. In addition, we provide recognition
techniques that are robust under other types of variation, such as variation due to
function-sharing optimizations and unfamiliar code.

This  section  gives  examples  of  the  recognition  capabilities  of  GRASPR.  This  serves  to 
demonstrate  what  GRASPR  can  do  in  terms  of  the  classes  of  variation  it  can  tolerate.  It  also 
provides  motivating  examples  of  the  goals  we  have  for  our  representational  formalism  and 
recognition  technique. 

2.3.1  Common  Program  Variations 

Program recognition is difficult due to the wide range of possible variations among programs.
An instance of a cliche may appear in a variety of forms. The following is a list of some of
the common types of variation found in programs. (This does not provide a complete list
of the variations we encountered in our empirical recognition studies with PiSim and CST.
Chapter 5 discusses more variations, both those tolerated and not tolerated by our current
system.)

•  Syntactic  variation  in  control  and  binding  constructs.  There  are  typically  many  ways 
to  achieve  the  same  net  flow  of  data  and  control.  Variable,  function,  data  structure, 
and  part  names  vary  widely.  Also,  syntax  varies  over  programming  languages. 

•  Implementation  variation.  A  given  abstraction  can  often  be  implemented  by  a  set  of 
different  concrete  algorithms  and  data  structures. 

•  Delocalization.  Parts  of  a  cliche  are  sometimes  widely  scattered  throughout  the  text 
of  a  program,  rather  than  being  contiguous. 

• Unrecognizable code. Not all programs are constructed completely of cliches.
Recognition must be able to ignore an unpredictable amount of unrecognizable code.

• Variation in the organization of components. Programs can be decomposed into
subroutines in a variety of ways. Also, data structures can aggregate pieces of data in a
multitude of different nested organizations.

• Redundancy. Programs may vary in how much computation is repeated in the same
instance of a cliche. For example, when the result of some inexpensive computation
is needed more than once, the program may simply recompute the value each time it
is needed rather than caching the result in a temporary variable.

•  Optimizations.  A  great  deal  of  variation  occurs  between  optimized  and  unoptimized 
programs  even  though  they  may  contain  the  same  abstract  cliche.  A  common  form 
of  optimization  introduces  function-sharing  in  which  the  implementations  of  two  or 
more  distinct  abstract  structures  are  merged. 

2.3.2  Examples  of  Capabilities 

GRASPR is able to recognize both CST and PiSim as sequential simulators of message-passing
parallel systems. It recognizes the synchronous simulation design in CST and the event-driven
simulation design in PiSim. It also recognizes the message-passing program execution cliches
in the portion of PiSim’s code that simulates handling messages.

The  primary  output  of  GRASPR  is  a  forest  of  design  trees.  A  design  tree  indicates  the 
cliches  found  in  the  program  and  how  they  are  related  to  each  other.  Figure  2-8  shows  a 
portion of the design tree produced in recognizing PiSim. Subtrees that are not shown are
collapsed  into  small  triangles  below  a  cliche  name.  The  dashed  lines  at  the  tree’s  fringe  are 
links  to  primitive  operations  in  the  source  code,  which  indicate  the  location  of  a  particular 
cliche  in  the  code.  The  drawing  of  the  design  tree  is  a  simplified  version  of  the  actual 
description  produced  by  GRASPR.  The  description  is  simplified  (for  presentation  purposes) 
in  that  only  operations  are  specified  in  the  leaves  of  the  tree,  while  the  actual  description 
includes  information  about  the  data  involved  in  each  cliche  instance.  In  general,  GRASPR 
may  produce  several  design  trees,  representing  recognition  of  multiple,  perhaps  overlapping, 
cliches  in  the  code. 

(The  design  trees  are  graph  grammar  derivation  trees,  which  are  described  in  Section 
3.2.2.  In  general,  they  may  be  graphs  in  that  a  recognized  cliche  may  be  a  component  or 
implementation  of  two  or  more  higher-level  cliches.) 

A secondary way to view the output of GRASPR is provided by a tool, called
“Paraphraser,” which takes the design trees produced during recognition and generates textual
documentation based on them. Paraphraser knits together schematized textual fragments
associated with the recognized cliches, filling in slots with identifiers taken from the source
code (e.g., *EVENT-QUEUE*). It bases the structure of the text on the relationships between
the cliches.
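The idea of knitting together schematized fragments can be sketched as follows. This is a Python illustration of the general technique only, not a description of how Paraphraser is implemented: the `DesignNode` class, the `TEMPLATES` table, the wording of the fragments, and the numbering scheme are assumptions of the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DesignNode:
    cliche: str
    slots: Dict[str, str] = field(default_factory=dict)
    children: List["DesignNode"] = field(default_factory=list)


# Hypothetical schematized fragments, one per cliche name.
TEMPLATES = {
    "Event-Driven Simulation":
        "{cliche} maintains an event-queue {queue} of events.",
    "Priority-Queue Insert":
        "{cliche} inserts {element} in the priority queue {queue}.",
}


def paraphrase(node, depth=1):
    """Return documentation lines for a design tree, numbered by depth."""
    # Fill the fragment's slots with identifiers recorded for this cliche instance.
    text = TEMPLATES[node.cliche].format(cliche=node.cliche, **node.slots)
    lines = [f"{depth}: {text}"]
    if node.children:
        # The text structure follows the relationships between the cliches.
        parts = ", ".join(child.cliche for child in node.children)
        lines.append(f"{node.cliche} is composed of {parts}.")
    for child in node.children:
        lines.extend(paraphrase(child, depth + 1))
    return lines


tree = DesignNode(
    "Event-Driven Simulation", {"queue": "*EVENT-QUEUE*"},
    [DesignNode("Priority-Queue Insert",
                {"element": "EVENT", "queue": "*EVENT-QUEUE*"})],
)
doc = paraphrase(tree)
```

Because the text is regenerated from the design tree, rerunning recognition after a source change yields documentation with the updated identifiers in the slots.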

Figure 2-9 shows some of the documentation generated for the design tree shown in
Figure 2-8. The documentation, although stilted, does describe the important design decisions
in the program and can help a programmer locate relevant objects in the code (via the
identifiers).

One potential benefit of automated program recognition is to use such automatically
produced documentation to maintain poorly documented or undocumented programs.
Automatically produced documentation can be updated whenever the source code changes,

Figure 2-8: Design tree for PiSim.


PISIM sequentially simulates a parallel message-passing system.

It is implemented as an Event-Driven Simulation.

1: Event-Driven Simulation asynchronously simulates a collection of
processing nodes handling messages, using an event-driven algorithm. An
event-queue *EVENT-QUEUE* of events is maintained. To start, an initial
event EVENT is inserted in the event-queue. On each step, an event is
pulled off and processed, which may create new events to be added to the
event-queue. The asynchronous nodes (which represent processing nodes)
are collected in an address-map, called *NODES*.

Event-Driven Simulation is composed of a Priority-Queue Insert, a Co-Earliest
Event-Driven Simulation Finished and a Generate Event Queues and Nodes.

  2: Priority-Queue Insert inserts EVENT in the priority queue
  *EVENT-QUEUE*. An element’s priority P is higher than another’s Q,
  if P < Q. If an element already exists in the priority queue with
  the same priority, then the new element is inserted into the queue
  after the existing element.

  Priority-Queue Insert is implemented as an Ordered Associative List Insert.

    3: Ordered Associative List Insert inserts EVENT in the
    ordered associative list *EVENT-QUEUE*...

  2: Co-Earliest Event-Driven Simulation Finished takes a sequence of
  event-queues and a sequence of address-maps and returns the address-map
  in the sequence of address-maps that corresponds to the first empty
  event-queue in the sequence of event-queues.

  Co-Earliest Event-Driven Simulation Finished temporally abstracts
  Co-Iterative Event-Driven Simulation Finished.

    3: Co-Iterative Event-Driven Simulation Finished terminates
    the simulation when the current event-queue (*EVENT-QUEUE*)
    is empty, returning the current value of the address-map (*NODES*).

    The event-queue is implemented as a Priority Queue.

    The Event-Driven Simulation Finished Test is implemented as a
    Priority Queue Empty.

      4: Priority Queue Empty tests whether the priority queue
      *EVENT-QUEUE* is empty...

  2: Generate Event Queues and Nodes generates event-queues and
  address-maps by repeatedly dequeuing the current event-queue and processing
  the event dequeued. Processing an event causes new events to be added
  to the event-queue and a new address-map to be created. The initial
  event-queue is *EVENT-QUEUE* and the initial address-map is *NODES*...
  Generate Event Queues and Nodes temporally abstracts Dequeue and
  Process Generation....


Figure 2-9: Some of the documentation generated for PiSim.


solving  the  pernicious  problem  of  misleading,  out-of-date  documentation. 

The  current  implementation  of  Paraphraser  is  heuristic  and  fragile.  Documentation 
generation  is  not  a  primary  focus  of  this  research.  The  problem  of  applying  recognition  to 
program  documentation  needs  further  study,  perhaps  borrowing  techniques  from  natural 
language  generation. 

Besides  documentation,  there  are  a  variety  of  ways  to  present  the  results  of  recognition, 
depending  on  how  the  results  will  be  used.  Future  work  is  needed  to  find  the  presentation 
appropriate  for  effective  interaction  with  people  and  other  automated  tools. 

Syntactic  Variation 

The design tree and documentation shown in Figures 2-8 and 2-9 were produced by
GRASPR in recognizing PiSim. The top-level portion of PiSim is shown in Figure 2-10. (The
source code for data structure definitions and some subroutines is not shown.) Inject is
the top-level function which starts the PiSim simulator. It takes an initial start message
type and the message’s arguments. After some initialization, it creates a Message data
structure, based on information about storage requirements computed from the Handler
that is associated with the message type. It randomly generates a destination address for
the message and computes the message’s arrival time from the destination Node’s current
time. Once the Message is created, an Event is constructed, whose Object part is the Message
and whose Time is the arrival time. The Event is placed on the event-queue *Event-Queue*
and Execute-Events is run to iteratively extract and execute the highest priority event on
the event-queue.
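The control structure just described can be sketched compactly. The following Python fragment is only an illustration of an event-driven loop with a time-ordered queue, not a transcription of PiSim: the names `enqueue_event`, `execute_events`, and `handle`, and the string payloads, are assumptions introduced here. Ties are placed after existing events, as the generated documentation in Figure 2-9 states for Priority-Queue Insert.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    time: int   # arrival time; a smaller time means a higher priority
    obj: str    # stands in for the Message carried by the event


def enqueue_event(queue: List[Event], new_event: Event) -> List[Event]:
    """Return a new time-ordered queue; ties go after existing events."""
    i = 0
    while i < len(queue) and queue[i].time <= new_event.time:
        i += 1
    return queue[:i] + [new_event] + queue[i:]


def execute_events(queue: List[Event],
                   handle: Callable[[Event], List[Event]]) -> List[str]:
    """Repeatedly pull off the earliest event until the queue is empty."""
    log = []
    while queue:
        event, queue = queue[0], queue[1:]
        log.append(event.obj)
        for created in handle(event):      # processing may create new events
            queue = enqueue_event(queue, created)
    return log


def handle(event: Event) -> List[Event]:
    # Hypothetical handler: the start message spawns two follow-up events.
    if event.obj == "start":
        return [Event(event.time + 5, "late"), Event(event.time + 1, "early")]
    return []


order = execute_events(enqueue_event([], Event(0, "start")), handle)
print(order)
```

The queue is rebuilt rather than mutated, which mirrors the functional style of the Lisp figures but is not essential to the algorithm.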

Given a syntactic variation of this code, such as the code in Figure 2-11, GRASPR is able
to recognize the same cliches to produce the same design tree and documentation (modulo
identifiers). Recognition is robust under variations in variable names (Length versus
Memory-Needed), binding and control constructs (cond versus if), and names of data
structures and their parts (Message versus Msg and Message-Destination versus Msg-Dest-Addr).
Start-PiSim also differs from Inject in the ordering of computations in the let binding
clauses. It routes dataflow differently, using fewer local variables. It also passes the event
queue around explicitly, rather than maintaining a global variable. Recognition robustness
is achieved as a result of the representation shift performed by GRASPR, which translates both
programs into the same graphical representation. In this representation, syntactic details
are suppressed.
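A toy illustration may help show why a dataflow view is insensitive to these variations. The Python sketch below is not GRASPR's translator; it assumes a hypothetical mini-representation in which a program is a parameter list, a sequence of let*-style bindings, and a result expression, and in which primitive operation names such as `pick-node` and `make-message` are already shared by both variants. Replacing each variable by the expression that produces its value removes variable names, binding order, and the choice of temporaries.

```python
def to_dataflow(params, bindings, result):
    """Replace every variable by the expression that produces its value."""
    env = {name: ("input", i) for i, name in enumerate(params)}

    def expand(expr):
        if isinstance(expr, str):        # variable reference
            return env[expr]
        if isinstance(expr, tuple):      # (operation, argument, ...)
            return (expr[0],) + tuple(expand(arg) for arg in expr[1:])
        return expr                      # literal constant

    for name, expr in bindings:          # sequential, like let*
        env[name] = expand(expr)
    return expand(result)


# Loosely modeled on Inject: many named temporaries.
inject = to_dataflow(
    ["type"],
    [("handler", ("get-handler", "type")),
     ("length", ("+", ("arity", "handler"), ("locals", "handler"), 2)),
     ("destination", ("pick-node",)),
     ("message", ("make-message", "destination", "length", "type"))],
    "message")

# Loosely modeled on Start-PiSim: other names, other order, fewer temporaries.
start_pisim = to_dataflow(
    ["start-msg-type"],
    [("address", ("pick-node",)),
     ("msg-handler", ("get-handler", "start-msg-type")),
     ("memory-needed",
      ("+", ("arity", "msg-handler"), ("locals", "msg-handler"), 2))],
    ("make-message", "address", "memory-needed", "start-msg-type"))

print(inject == start_pisim)
```

The sketch covers only names and binding structure. Control constructs, global versus explicitly passed state, and renamed data structures need further machinery that is not modeled here.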

Organization  of  Components 

The representation used by GRASPR also suppresses details of how programs are
decomposed into subroutines and how aggregate data structures are organized. For example, the
code in Figure 2-12 differs from the original PiSim code shown in Figure 2-10 in structural
organization. It bundles up the initialization and storage requirement computations into

(defvar *Event-Queue* nil "this is the global event-queue")
(defvar *Nodes* nil "this is the node array")

(defstruct Message
  (Destination nil)
  (Length 0)
  (Type nil)
  (Arguments nil))

(defstruct Event
  (Time 0)
  (Object nil))

(defun Inject (Type &rest Arguments)
  (Make-Nodes)
  (Clear-Nodes)
  (Clear-Event-Queue)  ;; resets *Event-Queue* to NIL
  (let* ((Handler (Get-Handler Type))
         (Length (+ (Handler-Arity Handler)
                    (Handler-Number-Of-Locals Handler)
                    2))
         (Destination (random (Number-Of-Nodes)))
         (Arrival-Time (Node-Time (Translate-Node Destination)))
         (Message (Make-Message :Destination Destination
                                :Length Length
                                :Type Type
                                :Arguments Arguments))
         (Event (Make-Event :Time Arrival-Time
                            :Object Message)))
    (Enqueue-Event Event)
    (Execute-Events)))

(defun Enqueue-Event (New-Event)
  (if (or (null *Event-Queue*)
          (< (Event-Time New-Event)
             (Event-Time (first *Event-Queue*))))
      (setq *Event-Queue*
            (cons New-Event *Event-Queue*))
      (setq *Event-Queue*
            (Insert-Event New-Event *Event-Queue*))))

(defun Execute-Events ()
  (cond ((null *Event-Queue*)
         *Nodes*)
        (t (Execute-Next-Event)
           (Execute-Events))))

Figure 2-10: Top-level portion of PiSim code.

(defvar *P-Nodes* nil "collection of nodes")

(defstruct Msg
  (Dest-Addr nil)
  (Storage-Length 0)
  (Type nil)
  (Args nil))

(defstruct Event
  (Time 0)
  (Object nil))

(defun Start-PiSim (Start-Msg-Type Args)
  (Make-Nodes)
  (Clear-Nodes)
  (let* ((Address (random (Number-Of-Nodes)))
         (Msg-Handler (Get-Handler Start-Msg-Type))
         (Memory-Needed (+ (Handler-Arity Msg-Handler)
                           (Handler-Number-Of-Locals Msg-Handler)
                           2))
         (Pending-Events
          (Enqueue-Event
           (Make-Event :Time (Node-Time (Translate-Node Address))
                       :Object (Make-Msg :Dest-Addr Address
                                         :Storage-Length Memory-Needed
                                         :Type Start-Msg-Type
                                         :Args Args))
           nil)))
    (Execute-Events Pending-Events)))

(defun Enqueue-Event (New-Event Event-Queue)
  (if (or (null Event-Queue)
          (< (Event-Time New-Event)
             (Event-Time (first Event-Queue))))
      (setq Event-Queue
            (cons New-Event Event-Queue))
      (setq Event-Queue
            (Insert-Event New-Event Event-Queue)))
  Event-Queue)

(defun Execute-Events (Pending-Events)
  (if (null Pending-Events)
      *P-Nodes*
      (Execute-Events
       (Execute-Next-Event Pending-Events))))


Figure 2-11: A syntactic variation of the portion of PiSim shown in Figure 2-10.

(defvar *Message-Queue* nil "this is the global message queue")
(defvar *Nodes* nil "this is the node array")

(defstruct Msg
  (Destination nil)
  (Arrival-Time 0)
  (Data nil))

(defstruct Handler-Data
  (Type nil)
  (Length 0)
  (Arguments nil))

(defun Initialize-Simulator ()
  (Make-Nodes)
  (Clear-Nodes)
  (Clear-Message-Queue))  ;; resets *Message-Queue* to NIL

(defun Compute-Storage-Reqmts (Type)
  (let ((Handler (Get-Handler Type)))
    (+ (Handler-Arity Handler)
       (Handler-Number-Of-Locals Handler)
       2)))

(defun Inject (Type &rest Arguments)
  (Initialize-Simulator)
  (let* ((Length (Compute-Storage-Reqmts Type))
         (Destination (random (Number-Of-Nodes)))
         (Arrival-Time (Node-Time (Translate-Node Destination)))
         (Handler-Data (Make-Handler-Data :Type Type
                                          :Length Length
                                          :Arguments Arguments))
         (Message (Make-Msg :Destination Destination
                            :Arrival-Time Arrival-Time
                            :Data Handler-Data)))
    (Enqueue-Message Message)
    (Process-Messages)))

(defun Enqueue-Message (Message)
  (if (or (null *Message-Queue*)
          (< (Msg-Arrival-Time Message)
             (Msg-Arrival-Time (first *Message-Queue*))))
      (setq *Message-Queue*
            (cons Message *Message-Queue*))
      (setq *Message-Queue*
            (Insert-Message Message *Message-Queue*))))

(defun Process-Messages ()
  (cond ((null *Message-Queue*) *Nodes*)
        (t (Process-Next-Message)
           (Process-Messages))))


Figure 2-12: An organizational variation of the top-level portion of PiSim.

subroutines. It also aggregates data differently. The original code defines an Event data
structure with two parts: an Object and a Time. The Object part is filled by a Message
data structure, which has the parts Destination, Length, Type, and Arguments. Pending
Events (containing Messages to be handled) are queued in an *Event-Queue*.

In the variation of this code shown in Figure 2-12, there is no Event data structure.
Instead Msg data structures are placed directly in an event-queue, called *Message-Queue*.
Each Msg contains all the data that is in a Message in the original code and additionally
has an Arrival-Time part, which plays the role of the Time part of Events in the original
code. Some of the data aggregated in Msg is aggregated further into a sub-structure, called
Handler-Data. This structure contains the parts Length, Type, and Arguments found in
Message originally and it is nested inside the Msg data structure, under the Data part.

Despite these differences, GRASPR recognizes the same cliches in this code as in the original
code in Figure 2-10.

It  is  important  that  recognition  be  robust  under  organizational  variations  because  the 
cliches  in  the  current  library  are  themselves  organized  hierarchically.  It  is  crucial  that  the 
program  need  not  mirror  this  same  organization  for  the  cliches  to  be  recognized  in  it. 

This  is  because  the  library  organization  is  not  necessarily  based  on  the  typical  way 
these  cliches  are  organized  in  programs.  There  are  two  reasons  it  is  not.  One  is  that  there 
is  not  always  exactly  one  “typical”  or  common  decomposition  of  cliches  into  subroutines 
or  nesting  of  aggregate  data  structures.  The  second  is  that  it  may  be  better  to  base  the 
library’s  organization  on  other  criteria  besides  what  is  typical.  For  example,  the  organization 
might be chosen to emphasize salient parts of cliches to facilitate recognition performance
improvements  or  to  help  choose  the  best  partial  analysis  during  near-miss  recognition. 

On  the  other  hand,  information  about  typical  decompositions  may  provide  valuable 
expectations  about  the  location  of  cliches  in  a  program.  This  can  considerably  narrow 
down  the  search  for  cliches,  as  discussed  in  Section  6.4.1. 

Our representation does not eliminate information about the boundaries of subroutines
and user-defined data structures within the program. It merely suppresses it, so that the
organizational variation does not hinder recognition. It places this information in annotations
on the graphical representation of the program. So, although in general we do not require
that a program’s function and data structure organization match the organization of the
cliches in our library, it is possible to impose constraints on the cliches being recognized,
requiring that they occur within certain boundaries. These boundaries can be heuristically
defined based on information such as subroutine or data structure decomposition. (See
Section 6.4.1 for more details.)
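The idea of suppressing subroutine boundaries while keeping them as annotations can be pictured with a small sketch. The Python code below is an assumption-laden toy, not GRASPR's mechanism: programs are nested tuples, `functions` maps a hypothetical subroutine name to its parameters and body, and each primitive operation produced by `inline` carries a `within` annotation naming the subroutine it came from. Recursive subroutines are not handled.

```python
def inline(expr, functions, origin, env=None):
    """Expand calls to user-defined subroutines, tagging each primitive
    operation with the subroutine whose body it came from."""
    env = env or {}
    if isinstance(expr, str):                  # variable reference
        return env.get(expr, expr)
    if not isinstance(expr, tuple):            # literal constant
        return expr
    args = [inline(arg, functions, origin, env) for arg in expr[1:]]
    if expr[0] in functions:                   # user-defined: splice in body
        params, body = functions[expr[0]]
        return inline(body, functions, expr[0], dict(zip(params, args)))
    return {"op": expr[0], "args": args, "within": origin}


def strip(node):
    """Drop the boundary annotations, keeping operations and dataflow."""
    if isinstance(node, dict):
        return (node["op"],) + tuple(strip(arg) for arg in node["args"])
    return node


# One monolithic top-level function ("h" is a free input variable).
monolithic = inline(
    ("make-message", ("+", ("arity", "h"), ("locals", "h"), 2)),
    {}, "Inject")

# The same computation with the storage computation split into a subroutine.
decomposed = inline(
    ("make-message", ("storage-reqmts", "h")),
    {"storage-reqmts": (["handler"],
                        ("+", ("arity", "handler"), ("locals", "handler"), 2))},
    "Inject")

print(strip(monolithic) == strip(decomposed))
print(decomposed["args"][0]["within"])
```

With the annotations stripped, the two decompositions are indistinguishable. With them retained, a recognizer could optionally require that a cliche fall within one boundary, which is the kind of constraint the text describes.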

Delocalized Cliches and Unfamiliar Code

Programs are rarely constructed entirely of cliches. Non-trivial programs are usually a
mix of cliched computational structures and unfamiliar code. In addition, the cliches are

(defun cst-start (init-msg)
  (send-msg init-msg)
  (shell-go))

(defun send-msg (msg)
  (setq *step-queue*
        (enqueue *step-queue* msg)))

(defun shell-go ()
  (cond ((step-done) nil)
        (t (step-nodes)
           (shell-go))))

(defun step-nodes ()
  (when *profile* (profile-step))                   ;; ?
  (when *log* (log-step))                           ;; ?
  (when *trace*                                     ;; ?
    (record-traced-selectors *trace-selectors*))    ;; ?
  (deliver-msgs)
  (when *meter-message-queues*                      ;; ?
    (record-message-queue-data))                    ;; ?
  (iteratively-step-nodes 0)
  (setq *step-nr* (1+ *step-nr*)))

(defun iteratively-step-nodes (x)
  (if (>= x (array-total-size *nodes*))
      nil
      (step-node x)
      (iteratively-step-nodes (1+ x))))

(defun step-node (node-nr)
  (let* ((node (get-node node-nr))
         (q (node-queue node)))
    (if (queue-empty? q)
        nil
        (multiple-value-bind (msg new-queue)
            (dequeue q)
          (setq node
                (make-node :queue new-queue
                           :objects (node-objects node)             ;; ?
                           :contexts (node-contexts node)
                           :busy-count (1+ (node-busy-count node))  ;; ?
                           :method-cache (node-method-cache node))) ;; ?
          (setq *nodes* (copy-replace-elt node node-nr *nodes*))
          (multiple-value-bind (new-nodes new-step-queue)
              (process-msg msg *nodes* *step-queue*)
            (setq *nodes* new-nodes
                  *step-queue* new-step-queue))))))


Figure  2-13:  Top-level  portion  of  CST.  Question  marks  indicate  unfamiliar  code. 


often  interleaved  with  unfamiliar  computation  as  well  as  with  each  other.  This  means  that 
parts  of  a  cliche  may  be  scattered  throughout  the  text  of  a  program.  Both  of  these  factors 
make  recognition  difficult  not  only  to  automate,  but  also  for  people  to  do  correctly. 

GRASPR is able to ignore unfamiliar code to partially recognize the program. It also
addresses the difficulty of recognizing delocalized cliches by employing a program
representation shift from source text to flow graph. Cliche parts that are separated by unrelated
expressions in the text become neighboring nodes in a flow graph.
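A minimal sketch shows how textual distance and dataflow distance can differ. In the Python fragment below, each statement is reduced to the variable it writes and the variables it reads; the statement labels are loosely inspired by the CST code but are otherwise hypothetical, and `dataflow_edges` is an illustrative helper rather than part of GRASPR.

```python
# Each statement: (label, variable written, variables read), in textual order.
statements = [
    ("dequeue",      "msg",   ["queue"]),
    ("log-step",     "log",   ["log"]),           # unrelated bookkeeping
    ("record-trace", "trace", ["trace"]),         # unrelated bookkeeping
    ("process-msg",  "nodes", ["msg", "nodes"]),
]


def dataflow_edges(stmts):
    """Connect each read to the most recent statement that wrote the variable."""
    last_writer = {}
    edges = set()
    for index, (_label, written, reads) in enumerate(stmts):
        for var in reads:
            if var in last_writer:
                edges.add((last_writer[var], index))
        last_writer[written] = index
    return edges


edges = dataflow_edges(statements)
print(edges)
```

Statements 0 and 3 are three lines apart in the text, yet they are joined by a single dataflow edge, while the two bookkeeping statements are not connected to either of them.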

For example, Figure 2-13 shows the top-level portion of the CST program, which uses the
synchronous simulation design. (The source code for data structure definitions and some
subroutines is not shown.) In addition to the simulation algorithm and data structures,
this code contains calls to functions that perform various metering, logging, and
statistics-gathering operations. These operations are not cliched, at least with respect to our current
library. The figure indicates unfamiliar portions of the code with question marks. The
cliches in the program are not found in one contiguous section of program text, but are
interrupted with unrelated computations.

Not only are there unfamiliar computations interleaved with the algorithmic cliches, but
there are also parts of data structures that are not recognizable as part of any data cliche.
For example, the data structure node consists of a Queue part (which acts as the local FIFO
buffer in the SYNCH-NODE data cliche) and a Contexts part (which contains a data structure
that has a part corresponding to the Memory part of the SYNCH-NODE). The rest of the parts
of node (Objects, Busy-Count, and Method-Cache) are novel, specific to this program. They
are used for gathering statistics and simulating the action of handling a message.

Despite  the  delocalization  of  the  cliches  and  the  unfamiliar  code,  GRASPR  is  able  to 
recognize  cliched  parts  of  this  program.  The  design  tree  and  documentation  produced  are 
shown  in  Figures  2-14  and  2-15  (in  abbreviated  form). 

Implementation  Variation 

Often,  there  is  more  than  one  cliched  implementation  of  an  abstract  operation  or  data  type. 
This  can  introduce  variability  between  programs  that  on  a  high  level  of  abstraction  perform 
the  same  abstract  operation  or  use  the  same  abstract  data  types.  It  is  important  that 
GRASPR  be  able  to  recognize  the  same  abstract  cliches  in  these  variations. 

For example, the CST program uses a FIFO queue to implement the queue of messages
collected on each cycle of the synchronous simulation and then delivered on the next. The
FIFO queue is implemented as a Circular Indexed Sequence, as shown in Figure 2-16.
However, another possible implementation of the queue is a LIFO queue (or stack), as
shown in Figure 2-17.

GRASPR produces the design-tree shown in Figure 2-18 for the code that uses this
implementation. It differs from the tree in Figure 2-14 only in the subtrees that are highlighted
by dotted boxes in the figure. The rest of the tree, including the high-level description of

CST sequentially simulates a parallel message-passing system.

It is implemented as a Synchronous Simulation.

1: Synchronous Simulation synchronously simulates a collection of processing
nodes handling messages. The synchronous nodes (which represent the
processing nodes) are collected in an address-map, called *NODES*. Each
node maintains a local buffer of pending messages to handle. Synchronous
Simulation is implemented as a Synchronous Simulation using Global
Message Buffer.

  2: Synchronous Simulation using Global Message Buffer iteratively advances
  each synchronous node in *NODES* by handling one message a piece. It uses
  a global message buffer to ensure that nodes advance in lock-step. The
  global buffer's initial value is *STEP-QUEUE*. The simulation starts by
  adding an initial message INIT-MSG to *STEP-QUEUE*. The simulation ends
  when no node has work to do (i.e., no more messages to handle) and the
  global message buffer *STEP-QUEUE* is empty. As messages are handled, new
  messages are created which are buffered on the global message buffer.
  Synchronous Simulation using Global Message Buffer is composed
  of a Queue Insert, an Earliest Simulation Finished and a Generate
  Global Message Buffers and Nodes.

    3: Queue Insert enqueues INIT-MSG on the Queue *STEP-QUEUE*, which is
    implemented as a FIFO. Queue Insert is implemented as a FIFO Enqueue.

      4: FIFO Enqueue enqueues INIT-MSG on the FIFO queue *STEP-QUEUE*,
      which is implemented as a Circular Indexed Sequence....

    3: Earliest Simulation Finished takes two input sequences: a sequence
    of address-maps, starting with *NODES*, and a sequence of global
    message buffers, starting with *STEP-QUEUE*. It outputs the first
    address-map in the input sequence of address-maps that satisfies the
    predicate that all nodes in the address-map have empty local buffers
    and the corresponding global message buffer is empty.

    Earliest Simulation Finished temporally abstracts Synchronous
    Simulation Finished?.

      4: Iterative Synchronous Simulation Finished tests whether a
      synchronous simulation is finished by testing whether the
      global buffer and all of the nodes’ local buffers are empty....

    3: Generate Global Message Buffers and Nodes generates address-maps
    and global message buffers by repeatedly delivering all
    messages in the global message buffer *STEP-QUEUE* and
    advancing the synchronous nodes in *NODES* by one step each....


Figure 2-15: A portion of the documentation generated for CST.

the program as a sequential simulation, remains the same.

It is impractical to enumerate all possible implementational variations of an abstract
cliche in the cliche library. The hierarchical organization of the cliche library allows
implementation variation to be represented compactly.
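The compactness argument can be made concrete with a small count. The Python sketch below uses a hypothetical three-part cliche whose parts have independent alternative implementations; the symbol names are invented for illustration and are not taken from the actual library. It compares the number of grammar rules with the number of fully expanded variants that a flat enumeration would have to list.

```python
from itertools import product

# Hypothetical grammar: nonterminal -> list of alternative right-hand sides.
grammar = {
    "Simulation":    [["Buffer-Insert", "Node-Lookup", "Finished-Test"]],
    "Buffer-Insert": [["fifo-enqueue"], ["stack-push"]],
    "Node-Lookup":   [["array-ref"], ["hash-lookup"], ["alist-lookup"]],
    "Finished-Test": [["queue-empty"], ["counter-zero"]],
}


def expansions(symbol, grammar):
    """Enumerate every fully expanded sequence of primitives for a symbol."""
    if symbol not in grammar:              # terminal: a primitive operation
        return [[symbol]]
    results = []
    for rhs in grammar[symbol]:
        parts = [expansions(s, grammar) for s in rhs]
        for combo in product(*parts):
            results.append([leaf for part in combo for leaf in part])
    return results


rule_count = sum(len(alternatives) for alternatives in grammar.values())
variant_count = len(expansions("Simulation", grammar))
print(rule_count, variant_count)
```

The rule count grows with the sum of the alternatives, while the number of flat variants grows with their product, so the gap widens quickly as more levels and more alternatives are added.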

Function-Sharing 

Programs can vary widely, depending on which optimizations they make. A type of
optimization that occurs frequently in programs is one in which two abstract cliches share some
functional part. In this case, the implementations of the cliches overlap. GRASPR is able to
recognize the two cliches in a program whether or not their implementations overlap.

For example, one of the things the CST program does in gathering statistics is that it
iterates through the nodes and computes the average length of their FIFO queues before
it delivers messages on each clock cycle. Suppose we added the cliche to our library that
performs this operation: it polls the SYNCH-NODEs, keeps a running total of their local buffer
sizes, and divides the sum by the number of SYNCH-NODEs.

This cliche is found in the current CST code in the function avg-queue-length, which
is called by profile-step in step-nodes, as shown in Figure 2-19. The recognition of this
cliche results in the design tree shown in Figure 2-20. (This tree is generated by GRASPR, in
addition to the design tree shown in Figure 2-14.)

Figure 2-21 shows a variation of the CST code in which the function-sharing
optimization has been introduced. In this code, the average queue length computation has been
moved into the iteration in iteratively-step-nodes that polls nodes and advances each
one in lock step. This function is already iterating through the nodes. So, in addition to
stepping each one, it has been made to keep a running total of their local queue lengths.
Its caller, step-nodes, finishes off the averaging computation. This optimization increases
the program’s efficiency by enumerating the nodes only once.

GRASPR is able to recognize both the queue averaging cliche and the advance nodes cliche
in this optimized program, even though the implementations of the cliches overlap. The
resulting design trees share a sub-tree, as shown in Figure 2-22.
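The optimization itself is easy to picture outside of Lisp. The Python sketch below is an illustration under stated assumptions, not a transcription of Figures 2-19 and 2-21: nodes are modeled simply as lists of pending messages, and `step_node` is a hypothetical stand-in that handles one message. The first function enumerates the nodes twice, once per cliche; the second shares a single enumeration between the two.

```python
def step_node(queue):
    """Hypothetical stand-in for advancing a node: handle one message."""
    return queue[1:]


def separate_passes(nodes):
    total = 0
    for queue in nodes:                       # pass 1: sum the queue lengths
        total += len(queue)
    stepped = [step_node(queue) for queue in nodes]   # pass 2: advance nodes
    return stepped, total / len(nodes)


def shared_pass(nodes):
    total, stepped = 0, []
    for queue in nodes:                       # one enumeration serves both
        total += len(queue)
        stepped.append(step_node(queue))
    return stepped, total / len(nodes)


nodes = [["a", "b"], ["c"], []]
print(separate_passes(nodes) == shared_pass(nodes))
```

Both versions compute the same stepped nodes and the same average. Only the enumeration is shared, which is the overlapping part that the two design trees would have in common.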

Redundancy 

Sometimes a part of a cliche might appear more than once in the same instance of a cliche.
The repeated part is most often some inexpensive computation whose result is needed more
than once. The program may simply repeat this computation, rather than caching the
result in a temporary variable. An example of this occurs in the function Splice-In-Bucket
shown in Figure 2-23, which is used by a hash table insertion function contained in PiSim.
Splice-In-Bucket creates and inserts an entry into a hash table bucket, called Bucket-List,
which is an ordered associative list. It does this by “cdr’ing” down the Bucket-List, looking
for a place to insert the new entry so that the entries remain ordered with respect to their

(defun cst-start (init-msg)
  (send-msg init-msg)
  (shell-go))

(defun deliver-msgs ()
  (cond ((queue-empty? *step-queue*) nil)
        (t (multiple-value-bind (msg new-step-queue)
               (dequeue *step-queue*)
             (setq *step-queue* new-step-queue)
             ...)
           (deliver-msgs))))

(defstruct queue
  (head 0)
  (tail 0)
  (length 0)
  (data-size *default-queue-size*)
  (data (make-array *default-queue-size* :adjustable t)))

(defun queue-empty? (queue)
  (= (queue-length queue) 0))

(defun enqueue (queue obj)
  (let* ((length (queue-length queue))
         (old-size (queue-data-size queue))
         (big-enough-queue (if (< length (1- old-size))
                               queue
                               (grow-queue queue))))
    (enqueue-base big-enough-queue obj)))

(defun enqueue-base (queue obj)
  (let ((old-size (queue-data-size queue)))
    (make-queue :head (queue-head queue)
                :tail (mod (1+ (queue-tail queue)) old-size)
                :length (1+ (queue-length queue))
                :data-size (queue-data-size queue)
                :data (copy-replace-elt obj
                                        (queue-tail queue)
                                        (queue-data queue)))))

(defun dequeue (queue)
  (let ((elt (aref (queue-data queue) (queue-head queue))))
    (setq queue (make-queue :head (mod (1+ (queue-head queue))
                                       (queue-data-size queue))
                            :tail (queue-tail queue)
                            :length (1- (queue-length queue))
                            :data-size (queue-data-size queue)
                            :data (queue-data queue)))
    (values elt queue)))

Figure 2-16: Buffer queue implemented as a FIFO, which in turn is implemented as a CIS.

(defun queue-empty? (queue)
  (null queue))

(defun enqueue (queue obj)
  (cons obj queue))

(defun dequeue (queue)
  (values (car queue)
          (cdr queue)))

Figure  2-17:  Buffer  queue  implemented  as  a  stack  (LIFO). 

Key parts. If an entry exists with the same Key as the new entry (Key), then the existing
entry’s Value part is changed to the new Value. Number-Entries keeps track of the number
of entries in the hash table. It is incremented only if the new entry is inserted, not if an
existing entry is changed.

This function repeats the computation of accessing the first element of Bucket-List,
using car, as indicated in the figure by asterisks. However, the cliche for
Ordered-Associative-List-Insert contains only one part corresponding to these expressions. It matches more
closely the program shown in Figure 2-24. GRASPR is able to recognize
Ordered-Associative-List-Insert in both variations.
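One way to picture why the two variations can be treated alike is to merge nodes that apply the same side-effect-free operation to the same inputs. The Python sketch below is an assumption made for illustration and is not claimed to be how GRASPR handles redundancy: each graph is a hypothetical dictionary from node identifiers to an operation and its inputs, and `canonical` collapses duplicated operations.

```python
def canonical(graph):
    """Return the set of distinct operations, merging nodes that apply
    the same operation to the same inputs."""
    def expr(ref):
        if ref not in graph:               # an external input such as "bucket"
            return ref
        op, inputs = graph[ref]
        return (op,) + tuple(expr(i) for i in inputs)
    return {expr(node) for node in graph}


# CAR is computed twice, once for each use (as in Figure 2-23).
redundant = {
    "c1": ("car", ("bucket",)),
    "c2": ("car", ("bucket",)),
    "k":  ("entry-key", ("c1",)),
    "n":  ("cons", ("c2", "rest")),
}

# CAR is computed once and its result fans out (as in Figure 2-24).
cached = {
    "c": ("car", ("bucket",)),
    "k": ("entry-key", ("c",)),
    "n": ("cons", ("c", "rest")),
}

print(canonical(redundant) == canonical(cached))
```

The merge is only sound for operations without side effects, which holds for car. A repeated operation that mutates state could not be collapsed this way.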

2.4  Breadth  of  Coverage 

The  cliches  captured  in  our  library  cover  a  broad  range  of  programs.  The  domain-specific 
cliches  occur  in  programs  in  the  domain  of  sequential  simulation  of  message-passing  parallel 
systems,  while  our  general-purpose  utility  cliches  are  found  in  programs  across  all  domains. 

However,  the  library’s  coverage  is  not  absolute.  Our  “example-driven”  cliche  acquisition 
was  based  on  an  extremely  small  sample  set  of  programs  in  a  particular  domain.  We  make 
no  claims  of  fully  modeling  the  simulation  domain  or  even  the  subset  of  it  that  deals  with 
message-passing  systems.  Also,  our  library  does  not  contain  all  utility  cliches  used  by 
experienced software engineers.

Despite these limitations, our library demonstrates the kinds of algorithms and data
structures that can be expressed within a graph grammar formalism. This formalism
captures these cliches at a level of abstraction that enables recognition by graph parsing to be
robust under many common types of program variations.

Figure 2-18: Design tree for implementational variation in which the buffer is a stack


(defun step-nodes ()
  (when *profile* (profile-step))
  (iteratively-step-nodes 0)
  ...)

(defun profile-step ()
  (avg-queue-length)
  ...)

(defun avg-queue-length ()
  (let ((tql 0))
    (setq tql (sum-queue-lengths 0 tql))
    (/ tql (array-total-size *nodes*))))

(defun sum-queue-lengths (x tql)
  (if (>= x (array-total-size *nodes*))
      tql
      (sum-queue-lengths
       (1+ x)
       (+ tql (queue-length (node-queue (get-node x)))))))

(defun iteratively-step-nodes (x)
  (if (>= x (array-total-size *nodes*))
      nil
      (step-node x)
      (iteratively-step-nodes (1+ x))))

Figure 2-19: Portion of CST that averages node queue lengths.
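[Design tree diagram: root Average-Local-Buffer-Size, with an enumeration subtree whose label is illegible and a Compute-Average subtree ending in a + leaf.]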

Figure 2-20: Design tree for queue length averaging computation.

(defun step-nodes ()
  (when *profile* (profile-step))
  (iteratively-step-nodes 0 0)
  ... (/ *total-queue-length*
         (array-total-size *nodes*)) ...
  ...)

(defun iteratively-step-nodes (x tql)
  (cond ((>= x (array-total-size *nodes*))
         (setq *total-queue-length* tql)
         nil)
        (t (step-node x)
           (iteratively-step-nodes
            (1+ x)
            (+ tql (queue-length (node-queue (get-node x))))))))

Figure  2-21:  Optimization  in  which  averi^ng  is  performed  while  advancing  nodes. 
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Figure  2-22:  Design  tree  for  optimized  code,  with  shared  sub-tree. 
(defun Splice-In-Bucket (Value Key Bucket-List Number-Entries)
  (cond ((Empty-or-Low-Priority-Head? Key Bucket-List)
         (values (cons (Make-Entry :Key Key :Value Value)
                       Bucket-List)
                 (1+ Number-Entries)))
        ((string= Key
                  (Entry-Key (car Bucket-List)))  ;; *
         (values (cons (Make-Entry :Key Key :Value Value)
                       (cdr Bucket-List))
                 Number-Entries))
        (t (multiple-value-bind (New-Bucket-List Num-Entries)
               (Splice-In-Bucket Value
                                 Key
                                 (cdr Bucket-List)
                                 Number-Entries)
             (values (cons (car Bucket-List)  ;; *
                           New-Bucket-List)
                     Num-Entries)))))

Figure 2-23: Code containing a redundant CAR computation.

(defun Splice-In-Bucket (Value Key Bucket-List Number-Entries)
  (cond ((Empty-or-Low-Priority-Head? Key Bucket-List)
         (values (cons (Make-Entry :Key Key :Value Value)
                       Bucket-List)
                 (1+ Number-Entries)))
        (t (let ((This-Entry (car Bucket-List)))  ;; *
             (cond ((string= Key
                             (Entry-Key This-Entry))  ;; *
                    (values
                     (cons (Make-Entry :Key Key :Value Value)
                           (cdr Bucket-List))
                     Number-Entries))
                   (t (multiple-value-bind (New-Bucket-List Num-Entries)
                          (Splice-In-Bucket Value
                                            Key
                                            (cdr Bucket-List)
                                            Number-Entries)
                        (values
                         (cons This-Entry New-Bucket-List)  ;; *
                         Num-Entries))))))))

Figure  2-24:  Code  in  which  the  result  of  CAR  is  cached  and  reused. 
Chapter 3


The  Flow  Graph  Formalism 


GRASPR  is  able  to  tolerate  many  of  the  common  types  of  program  variations  mentioned 
in  Section  2.3.1  by  using  a  dataflow  graph  representation  for  programs  and  by  using  a 
flow  graph  grammar  to  encode  programming  cliches.  Program  recognition  is  achieved  by 
parsing  the  dataflow  graph  in  accordance  with  the  flow  graph  grammar.  There  are  several 
advantages  to  using  a  graph  grammar  formalism  to  represent  programs  and  cliches: 

•  Quasi-canonical form. Dataflow graphs abstract away irrelevant syntactic details and
give the representation programming-language independence.

•  Localization.  Dataflow  graphs  make  dataflow  dependencies  explicit,  imposing  a  partial 
ordering  on  the  program’s  operations  (rather  than  the  linear,  total  ordering  imposed 
by  text).  The  effect  is  that  patterns  that  are  textually  delocalized  (noncontiguous) 
can  often  become  localized  in  a  flow  graph  where  only  essential  dataflow  relationships 
are  captured. 

•  Compact  representation.  Only  primitive  operations  and  dataflow  between  them  are 
represented  by  the  graph. 

•  Fragmentary patterns can be represented without including unnecessary details.

•  Hierarchical relationships can be drawn between graphs, with the graph grammar
formalism providing a firm mathematical basis.

In  this  chapter,  we  define  the  flow  graph  grammar  formalism  used  to  represent  programs 
and  cliches.  We  present  the  basic  formalism  first  and  then  describe  extensions  to  it  that  allow 
us  to  deal  with  variations  due  to  redundancy  versus  structure-sharing,  and  variations  in 
aggregation  organization.  We  then  present  a  chart  parser  for  flow  graphs  in  this  formalism. 
Interleaved  with  the  description  of  the  formalism  are  sections  that  ground  the  description 
in  the  concrete  application  of  program  recognition.  These  may  help  clarify  and  motivate 
the restrictions on flow graphs and graph grammar rules. These sections are unnecessary
for understanding the general description of the formalism, which has a broad range of
applicability to other problem domains besides program recognition (as discussed in Section
7.4). In the final section, we summarize related graph grammar research.


3.1  Flow  Graphs 

A  flow  graph  is  an  attributed,  directed,  acyclic  graph,  whose  nodes  have  ports  -  entry  and 
exit points for edges. Flow graphs have the following properties and restrictions:

1.  Each  node  has  a  type  which  is  taken  from  a  vocabulary  of  node  types. 

2.  Each  node  has  two  disjoint  tuples  of  ports,  called  its  inputs  and  outputs.  Each  port 
has  a  type,  taken  from  a  vocabulary  of  port  types.  All  nodes  of  the  same  type  have 
the  same  number  and  type  of  ports  in  their  input  and  output  port  tuples.  The  size 
of  the  input  port  tuple  of  a  node  is  called  the  input  arity  of  the  node,  while  its  output 
arity  is  the  size  of  the  node’s  output  port  tuple. 

3.  A  node’s  inputs  (or  outputs)  may  be  empty,  in  which  case  the  node  is  called  a  source 
(or  sink,  respectively). 

4.  Edges  do  not  merely  adjoin  nodes,  but  rather  edges  adjoin  ports  on  nodes.  All  edges 
run  from  an  output  port  on  one  node  to  an  input  port  on  another  node.  The  ports 
connected by an edge must have the same port type.¹ (An exception to this is that a
port  of  the  special  designated  type  Any  can  connect  to  ports  of  any  type.) 

5.  More  than  one  edge  may  adjoin  the  same  port.  Edges  entering  the  same  input  port 
are  called  fan-in  edges,  while  edges  leaving  a  common  output  port  are  called  fan-out 
edges. 

6.  Ports  need  not  have  edges  adjoining  them.  Any  input  (or  output)  port  in  a  flow  graph 
that  does  not  have  an  edge  running  into  (or  out  of)  it  is  called  an  input  (or  output) 
of  that  graph. 

7.  Each  flow  graph  has  a  vocabulary  of  attributes,  which  is  partitioned  into  two  disjoint 
sets  of  node  attributes  and  edge  attributes.  Each  attribute  has  a  (possibly  infinite) 
set  of  possible  values.  Associated  with  each  node  type  is  a  finite  subset  of  the  node 
attributes.  These  are  the  only  attributes  for  which  nodes  of  that  type  can  hold  values. 
All  edges  hold  a  value  for  each  of  the  edge  attributes. 

Flow graphs were first defined by Brotsky [15], drawing upon the earlier work on web
grammars [27, 94, 102, 105, 119]. Wills [144, 145] extended Brotsky’s definition so that flow
graphs can include sinks and sources (item 3 above), fan-in and fan-out edges (item 5), and
attributes (item 7).

¹ In the future, a type hierarchy system may be used to allow ports to be connected if one port’s type is
a subtype of the other’s.
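
As a concrete reading of the seven properties above, the following minimal Python sketch models nodes with typed port tuples and edges that join an output port to an input port. The names `Node`, `FlowGraph`, `add_edge`, `graph_inputs`, and `graph_outputs` are illustrative assumptions, not part of GRASPR; attributes are reduced to plain dictionaries, and acyclicity is assumed rather than checked.

```python
from dataclasses import dataclass, field

ANY = "Any"  # the special port type that may connect to any other type


@dataclass
class Node:
    name: str            # unique label within the graph
    type: str            # node type, taken from the vocabulary of node types
    inputs: tuple = ()   # port types of the input tuple (empty for a source)
    outputs: tuple = ()  # port types of the output tuple (empty for a sink)
    attrs: dict = field(default_factory=dict)


@dataclass
class FlowGraph:
    nodes: dict = field(default_factory=dict)  # name -> Node
    edges: list = field(default_factory=list)  # (src, out_idx, dst, in_idx, attrs)

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_edge(self, src, out_idx, dst, in_idx, **attrs):
        """Join output port out_idx of src to input port in_idx of dst (1-based)."""
        t_out = self.nodes[src].outputs[out_idx - 1]
        t_in = self.nodes[dst].inputs[in_idx - 1]
        if t_out != t_in and ANY not in (t_out, t_in):
            raise TypeError(f"port types differ: {t_out} vs {t_in}")
        self.edges.append((src, out_idx, dst, in_idx, attrs))

    def graph_inputs(self):
        """Input ports with no incoming edge (property 6)."""
        used = {(d, i) for _, _, d, i, _ in self.edges}
        return [(n.name, i) for n in self.nodes.values()
                for i in range(1, len(n.inputs) + 1) if (n.name, i) not in used]

    def graph_outputs(self):
        """Output ports with no outgoing edge (property 6)."""
        used = {(s, o) for s, o, _, _, _ in self.edges}
        return [(n.name, o) for n in self.nodes.values()
                for o in range(1, len(n.outputs) + 1) if (n.name, o) not in used]
```

Because edges are stored per port pair, fan-in and fan-out (property 5) need no special handling: several edges may simply name the same port.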
Figure 3-1: An example attributed flow graph.

Figure 3-1 shows an example flow graph. We refer to nodes by their node type. If
there are two nodes with the same type, we precede the node type with a unique label.
Ports are identified using numeric annotations on the nodes. Each numeric port identifier
is followed by a colon and the port’s type. The edges of the flow graph have been labeled
with subscripted “e”s.

Edge e_5 connects two ports of type t_3, while edge e_4 connects a port of type t_4 with one
of type Any. Edges e_1 and e_2 fan out of port 2 on node b, while edges e_3 and e_6 fan into
port 1 of node g. Node d is a sink. Port 1 of node b is an input of the graph and ports 2
and  3  of  node  g  are  outputs  of  the  graph.  (Pictorially,  we  emphasize  inputs  and  outputs  of 
the  graph  by  drawing  edge  stubs  adjoining  them.) 

In  the  figure,  attribute-value  pairs  (in  the  form  attribute:value)  are  shown  in  italics  near 
the  node  or  edge  which  holds  a  value  for  the  attribute.  In  this  example,  all  node  types  have 
the  node  attribute  color.  The  node  type  g  additionally  has  the  attributes  age  and  size 
and  the  node  of  type  g  in  this  particular  graph  has  values  15  and  60,  respectively,  for  these 
attributes.  All  edges  have  the  attribute  distance. 

Useful  Definitions 

A flow graph H is a sub-flow graph of a flow graph G if and only if H's nodes are a subset
of  G's  nodes,  and  H's  edges  are  the  subset  of  G's  edges  that  connect  only  those  ports  found 
on  nodes  of  H. 
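
The sub-flow graph definition amounts to taking an induced subgraph at the level of ports. A minimal sketch, assuming a graph is given as a set of node names plus a set of `(src, out_port, dst, in_port)` tuples (this encoding and the name `sub_flow_graph` are assumptions made for illustration):

```python
def sub_flow_graph(nodes, edges, keep):
    """Return the sub-flow graph induced by the node names in `keep`.

    nodes: set of node names.
    edges: set of (src, out_port, dst, in_port) tuples.
    An edge survives only if both of its ports sit on kept nodes.
    """
    keep = set(keep)
    if not keep <= set(nodes):
        raise ValueError("keep must be a subset of the graph's nodes")
    kept_edges = {e for e in edges if e[0] in keep and e[2] in keep}
    return keep, kept_edges


# Usage: dropping node "c" also drops the edge that ends on one of its ports.
nodes = {"a", "b", "c"}
edges = {("a", 1, "b", 1), ("b", 1, "c", 2)}
print(sub_flow_graph(nodes, edges, {"a", "b"}))
```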

Isomorphism can be defined between flow graphs using a variation of its standard definition,
which accounts for edges adjoining ports, rather than nodes. Two flow graphs F_1
and F_2 are isomorphic if and only if there is a one-to-one mapping φ of the nodes of F_1
onto the nodes of F_2, such that adjacency is preserved - i.e., the i-th output of a node n_1 is
connected to the j-th input of a node n_2 in F_1 if and only if the i-th output of the node φ(n_1)
is connected to the j-th input of the node φ(n_2) in F_2.
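
The port-aware isomorphism condition can be checked mechanically. The sketch below assumes the same tuple encoding of edges, `(src, out_port, dst, in_port)`, and follows only the adjacency condition stated above; matching of node types, which a practical recognizer would presumably also require, is omitted. The brute-force search in `find_isomorphism` is exponential and intended for tiny examples only.

```python
from itertools import permutations


def is_isomorphism(phi, nodes1, edges1, nodes2, edges2):
    """True if phi maps nodes1 one-to-one onto nodes2 and preserves
    port-level adjacency in both directions."""
    if set(phi) != set(nodes1) or set(phi.values()) != set(nodes2):
        return False
    if len(set(phi.values())) != len(phi):  # phi must be injective
        return False
    mapped = {(phi[s], o, phi[d], i) for (s, o, d, i) in edges1}
    return mapped == set(edges2)


def find_isomorphism(nodes1, edges1, nodes2, edges2):
    """Try every node mapping; return the first isomorphism found, or None."""
    n1, n2 = sorted(nodes1), sorted(nodes2)
    if len(n1) != len(n2):
        return None
    for perm in permutations(n2):
        phi = dict(zip(n1, perm))
        if is_isomorphism(phi, nodes1, edges1, nodes2, edges2):
            return phi
    return None
```

Note that two graphs with the same node-level connectivity are not isomorphic here if an edge arrives at a different input port, which is the point of the port-based variation.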
3.2  Flow Graph Grammars

A  flow  graph  grammar  is  a  set  of  rewriting  rules  (or  productions),  each  specifying  how  a 
node  in  a  flow  graph  can  be  replaced  by  a  particular  sub-flow  graph.  All  rules  in  a  flow  graph 
grammar  rewrite  a  single  left-hand  side  node  to  a  right-hand  side  flow  graph.  The  grammar 
specifies  which  flow  graphs  are  in  a  particular  set  of  flow  graphs,  called  the  language  of  the 
grammar. 

In addition, the flow graph grammar may be attributed: each rule can specify how
to  compute  attribute  values  of  the  rule’s  nodes  from  the  attributes  of  other  nodes  in  the 
rule.  Each  rule  can  also  impose  constraints  on  the  attributes  of  the  rule’s  nodes.  Every 
flow  graph  in  the  language  of  an  attributed  grammar  has  attribute  values  that  satisfy  the 
constraints  of  the  rules  generating  the  flow  graph. 

More  precisely,  a  flow  graph  grammar  G  has  four  parts:  two  disjoint  sets  N  and  T  of 
node  types,  called  non-terminals  and  terminals,  respectively,  a  set  P  of  productions,  and 
a set S of distinguished non-terminal types, called the start types of G. (By convention,
non-terminal  types  are  denoted  by  capital  letters,  while  terminal  types  are  in  lower  case.) 

Each  production  in  P  consists  of  the  following  five  parts: 

•  A flow graph L, called the left-hand side, containing a single node having a
non-terminal type.

•  A flow graph R, called the right-hand side, containing nodes of non-terminal or
terminal types.

•  An  embedding  relation  C  which  specifies  the  correspondence  between  the  ports  of  L 
and  R. 

•  A set of attribute conditions, which impose constraints (in the form of relations) on
the  attribute  values  of  nodes  and  edges  in  R. 

•  A  set  of  attribute  transfer  rules,  each  of  which  specifies  the  value  of  an  attribute  of 
L’s  node  in  terms  of  the  attributes  of  the  nodes  and  edges  in  R. 

Sections  3.2.1  and  3.2.3  discuss  the  embedding  relation  and  the  attribute  conditions  and 
transfer  rules  in  more  detail. 

3.2.1  Embedding  Relation 

The  embedding  relation  is  necessary  in  flow  graph  grammar  rules  (unlike  string  grammar 
rules)  to  provide  connectivity  information  when  an  occurrence  of  a  left-hand  side  is  rewritten 
during  a  derivation.  It  specifies  how  the  ports  connected  to  the  left-hand  side  should  be 
connected  to  the  right-hand  side  flow  graph,  and  possibly  to  each  other,  when  the  left-hand 
side  is  replaced  by  the  right-hand  side.  (It  is  used  in  an  analogous  way  in  the  reverse  process 
of reducing an occurrence of a rule’s right-hand side to its left-hand side during recognition
or  parsing.) 

The embedding relation C is a binary relation on ℒ × (ℛ ∪ ℒ), where ℒ denotes the set of
left-hand side ports and ℛ denotes the set of right-hand side ports of a rule. A left-hand side
port l_i and a right-hand side port or another left-hand side port p_j are said to “correspond”
if (l_i, p_j) ∈ C. The embedding relation is restricted in the following ways.

1.  If  a  left-hand  side  port  corresponds  to  a  right-hand  side  port,  then  both  ports  must 
be  of  the  same  direction  (input  or  output).  If  two  left-hand  side  ports  correspond  to 
each  other,  they  must  be  of  opposite  directions. 

2.  More  than  one  right-hand  side  port  and/or  left-hand  side  port  may  correspond  to 
the  same  left-hand  side  port.  However,  more  than  one  left-hand  side  port  may  not 
correspond  to  the  same  right-hand  side  port. 

3.  Each  left-hand  side  port  corresponds  to  at  least  one  right-hand  side  or  left-hand  side 
port.  (A  right-hand  side  port  need  not  correspond  to  some  left-hand  side  port.) 

The  right-hand  side  ports  corresponding  to  ports  on  the  left-hand  side  node  need  not  be 
inputs  or  outputs  of  the  right-hand  side  graph  (i.e.,  they  may  be  connected  to  other  ports 
in  the  graph). 
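
The three restrictions on the embedding relation lend themselves to a mechanical check. The following sketch assumes ports are encoded as `Port(side, name, direction)` records; this encoding and the function name `check_embedding` are illustrative assumptions, not the representation used by the recognition system.

```python
from collections import namedtuple

# side is "L" or "R"; direction is "in" or "out" (illustrative encoding)
Port = namedtuple("Port", "side name direction")


def check_embedding(lhs_ports, relation):
    """Return the list of violated restrictions (empty when the relation is valid).

    relation is a set of (l, p) pairs: l is a left-hand side port and
    p is a right-hand side port or another left-hand side port.
    """
    problems = []
    covered = set()  # left-hand side ports that take part in some pair
    owner = {}       # right-hand side port -> the left-hand side port it corresponds to
    for l, p in sorted(relation):
        covered.add(l)
        if p.side == "R":
            if p.direction != l.direction:
                problems.append(f"1: {l.name} and {p.name} differ in direction")
            if owner.setdefault(p, l) != l:
                problems.append(f"2: {p.name} corresponds to two left-hand side ports")
        else:
            covered.add(p)  # straight-through between two left-hand side ports
            if p.direction == l.direction:
                problems.append(f"1: st-thru {l.name}/{p.name} must pair an input with an output")
    for l in lhs_ports:
        if l not in covered:
            problems.append(f"3: {l.name} corresponds to no port")
    return problems
```

A left-hand side port may appear in many pairs (restriction 2 permits this), so only right-hand side ports are tracked in `owner`.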

The definition of the embedding relation is extended (as described in Section 3.4.2) to
encode aggregation information. However, the extended relation still obeys these
restrictions.

When a left-hand side port l_i corresponds with another left-hand side port l_j, the rule
is  said  to  contain  a  straight-through  (abbreviated  “st-thru”).  We  discuss  the  significance  of 
st-thrus  in  the  next  section,  where  we  describe  how  the  embedding  relation  is  used  in  the 
derivation  of  flow  graphs. 

Figure  3-2  shows  an  example  flow  graph  grammar.  In  this  example,  ports  are  referred 
to as subscripted node types (e.g., a_1 refers to the port labeled 1 on the node with type a).
Port types are not shown. The port correspondences of each rule are indicated pictorially
by matching Greek letters. For example, left-hand side port A_1 corresponds to right-hand
side port a_1. (This grammar does not have attribute conditions or attribute transfer rules,
so  they  are  not  shown.  See  Section  3.2.3  for  the  details  of  attribute  handling  and  Figure 
3-5  for  a  complete  picture.) 

By  convention,  when  a  port  correspondence  involves  an  internal  right-hand  side  port 
(not  an  input  or  output  of  the  right-hand  side  graph),  we  draw  an  edge  stub  coming  into 
or  out  of  that  port.  We  annotate  the  edge  stub  with  the  port  correspondence  label.  For 
example, this is done in drawing the rule for non-terminal A in Figure 3-2. Also, when
two  or  more  right-hand  side  ports  correspond  to  the  same  left-hand  side  port,  the  edge 
stubs  from  the  right-hand  side  ports  are  drawn  as  if  they  are  merged  with  each  other.  This 
abbreviated notation is used, for example, in depicting the rule for B. (This makes it easier
to visualize how the right-hand side of a rule is embedded into a graph when the left-hand
side is expanded during derivation.)

Figure 3-2: An example flow graph grammar.

Similarly,  st-thrus  are  depicted  as  lines  which  do  not  adjoin  any  port,  but  which  may 
be  merged  with  an  edge  stub  and/or  another  st-thru.  In  drawings,  they  are  annotated  with 
the  pair  of  correspondence  labels  associated  with  the  left-hand  side  ports  that  correspond. 
The rule for F contains a st-thru, since ports F_1 and F_4 correspond.

3.2.2  Flow  Graph  Grammar  Derivations 

A flow graph is derived from a start type S_0 of a flow graph grammar by starting with a flow
graph containing a single node of type S_0 and repeatedly applying the grammar’s rewrite
rules  (productions)  to  the  non-terminals  in  this  graph  until  no  non-terminals  are  left. 

Each  rewrite  rule  specifies  how  an  isomorphic  occurrence  of  the  rule’s  left-hand  side  L 
can  be  replaced  by  the  rule’s  right-hand  side  graph  R.  The  embedding  relation  C  of  the 
rule is used to embed R in the graph once L has been removed. In particular, for each
right-hand side port r_j and left-hand side port l_i related by C, r_j is connected to all of the
ports that were connected to l_i before L was removed.

In addition, if a left-hand side input port l_i corresponds to a left-hand side output port
l_j, then edges are drawn connecting each of the ports connected to l_i to each of the ports
connected to l_j. In other words, when a rule contains a st-thru, the embedding relation
between the ports involved, l_i and l_j, imposes the constraint that the ports adjacent to l_i
and l_j become connected directly to each other when the left-hand side is rewritten.
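
One rewriting step, including the treatment of straight-throughs, can be sketched as follows. The encoding is an assumption made for illustration: a port is a tuple `(node, direction, index)`, an edge is an `(output_port, input_port)` pair, and the right-hand side nodes are assumed to carry names that do not clash with nodes already in the graph.

```python
def rewrite(edges, lhs_node, rhs_edges, embedding):
    """Replace the node lhs_node by a right-hand side graph.

    edges, rhs_edges: sets of (output_port, input_port) pairs.
    embedding: set of (lhs_port, other_port) pairs, where other_port is a
    right-hand side port or another port of lhs_node (a straight-through).
    """
    def neighbours(port):
        # ports connected to `port` before the left-hand side is removed
        if port[1] == "in":
            return {src for src, dst in edges if dst == port}
        return {dst for src, dst in edges if src == port}

    def connect(a, b):
        # orient the new edge from the output port to the input port
        return (a, b) if a[1] == "out" else (b, a)

    # remove lhs_node together with the edges adjoining its ports
    result = {(s, d) for s, d in edges if lhs_node not in (s[0], d[0])}
    # add the right-hand side graph
    result |= set(rhs_edges)
    for l, p in embedding:
        if p[0] != lhs_node:
            # right-hand side port p inherits the old neighbours of l
            result |= {connect(p, n) for n in neighbours(l)}
        else:
            # straight-through: neighbours of l and p are joined directly
            result |= {connect(a, b) for a in neighbours(l) for b in neighbours(p)}
    return result
```

For a rule with a single straight-through and an empty right-hand side, the node simply disappears and its producer is wired directly to its consumer.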

For example, a sample derivation of a graph from the grammar of Figure 3-2 is shown in
Figure 3-3. When the non-terminal node A is expanded in the second step of the derivation,
A is removed from the graph, along with the edges adjoining its ports. Then the right-hand
side of the rule for A is added to the graph. Finally, edges are drawn between the right-hand
side ports that correspond to A_1, A_2, and A_3 and the ports to which A_1, A_2, and A_3
(respectively) had been connected.

In  string  grammars,  the  derivation  tree  is  used  as  a  canonical  representation  of  equivalent 
derivations,  which  abstracts  away  from  the  order  in  which  productions  are  applied  in  the 
derivations.  It  is  useful  to  make  use  of  a  similar  representation  for  flow  graph  derivations. 

As  in  the  string  case,  a  derivation  tree  has  vertices  labeled  with  the  node  type  of  a 
non-terminal  that  was  expanded  during  the  derivation.  However,  unlike  the  string  case,  the 
children  of  each  vertex  are  related  in  a  partial  ordering.  The  right-hand  side  graph  in  the 
production  for  the  vertex’s  label  defines  this  partial  ordering.  (Derivation  trees  are  normally 
shown  without  the  edges  between  the  nodes  of  the  tree  to  reduce  clutter.)  For  example,  the 
derivation  sequence  of  Figure  3-3  is  represented  by  the  derivation  tree  of  Figure  3-4. 

3.2.3  Attribute  Conditions  and  Transfer  Rules 

So far, we have discussed the aspects of flow graph grammars that impose structural
constraints on the flow graphs in their languages, for example, by constraining their node types
and  edge  connections.  This  section  describes  how  the  non-structural  aspects  of  a  flow  graph 
are  constrained.  Attributes  are  used  to  represent  information  that  cannot  be  adequately 
expressed  in  the  structure  of  a  flow  graph.  Attribute  conditions  in  grammar  rules  impose 
constraints  on  these  attributes. 

The  concept  of  an  attributed  string  grammar  was  formalized  by  Knuth  [77]  as  a  way  to 
assign  semantics  to  strings  in  a  context  free  language.  Attribute  values  are  computed  from 
other  attribute  values  within  a  rule.  This  is  called  attribute  evaluation.  The  attributes  that 
are  computed  represent  some  aspect  of  the  “meaning”  of  the  string  being  parsed  (e.g.,  the 
decimal  value  of  a  binary  number). 
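
To make attribute evaluation concrete, here is a small sketch in the spirit of the binary-number example mentioned above. It assumes a simplified, integer-only grammar (L -> B | L B, B -> '0' | '1') with a single synthesized attribute; this rule set and the tuple encoding of parse trees are assumptions chosen for brevity, not a reproduction of the grammar in [77].

```python
def value(tree):
    """Synthesize the decimal value of a parse tree for the grammar
    L -> B | L B,  B -> '0' | '1'.

    A tree is either a bit ('0' or '1') or a tuple (left_subtree, bit).
    """
    if tree in ("0", "1"):
        return int(tree)  # value(B) is the bit itself
    left, bit = tree
    # value(L B) := 2 * value(L) + value(B), computed from the children only
    return 2 * value(left) + value(bit)


# Parse tree of the string "1101", nested to the left as the grammar dictates.
tree = ((("1", "1"), "0"), "1")
print(value(tree))  # 13
```

Each value depends only on the values of the children, which is what makes the attribute synthesized.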

Since then, attribute grammars have been used extensively in such areas as pattern
recognition [16, 17, 39, 48, 86, 135], compiler technology [40, 41, 47, 68, 74, 78, 79],
programming environments [6, 28], software specification and development [38, 97, 98, 101, 131],
and test case generation [30]. Raiha [107] gives a bibliography of the early papers. These
systems use attribute grammars to deal with nonstructural, semantic properties of a
pattern and to reduce the complexity of the grammar. Much of the theoretical work in this
area  has  focussed  on  developing  efficient  attribute  evaluation  strategies  [28,  68,  73,  109], 
the  complexity  of  checking  that  attribute  grammars  are  well-formed  [64],  and  assisting  the 
writing  of  attribute  grammars  which  contain  complex  dependencies  among  the  attributes 
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Figure  3-3:  An  example  derivation  sequence. 
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Figure  3-4:  An  example  derivation  tree. 


[29]. 

Our  flow  graph  grammars  are  attributed  grammars  in  the  sense  that  their  productions 
contain  attribute  transfer  rules  for  computing  attribute  values  from  the  attribute  values 
of  other  nodes  and  edges  within  the  rule.  (These  are  also  called  “semantic  rules”  [77], 
“attribute transfer functions” [16], or “attribute transfer specifications” [145].)

In  general,  attribute  transfer  rules  can  associate  the  attribute  of  some  node  or  edge  on 
either  side  of  a  rule  with  a  function  for  computing  its  value  from  the  attributes  of  the  other 
nodes  and  edges  (on  either  side)  of  the  rule.  Attributes  that  are  computed  for  the  left-hand 
side  node  from  the  attributes  of  the  right-hand  side  are  called  synthesized  attributes.  Those 
that  are  computed  for  a  right-hand  side  node  or  edge  from  the  attributes  of  the  left-hand 
side  node  and/or  other  nodes  and  edges  in  the  right-hand  side  are  called  inherited  attributes. 

Currently,  the  flow  graph  grammar  used  by  the  recognition  system  uses  only  synthesized 
attributes.  This  is  because  our  attributed  flow  graph  grammars  are  not  used  so  much  for 
computing  attribute  values,  as  for  imposing  constraints  on  the  attributes  of  the  flow  graph 
being  parsed.  Inherited  attributes  are  useful  if  the  value  of  an  attribute  involves  complex 
dependencies  across  the  derivation  tree.  However,  the  attribute  values  computed  in  the 
current  system  are  based  on  simple  relationships  among  attributes.  Synthesized  attributes 
are  adequate. 

Constraints are imposed on attributes in the form of attribute conditions on grammar
rules. Attribute conditions are relations on the attribute values of the nodes and edges of a
flow graph grammar rule’s right-hand side. They specify constraints that must be satisfied
by the attributes of a flow graph if it is in the language of the grammar. (These are also
called “context conditions” [68], “constraints” [145], and “applicability predicates” [16].)

The  attribute  conditions  and  attribute  transfer  rules  of  a  production  are  used  primarily 
during  parsing.  (They  can  be  used  during  generation  to  produce  a  set  of  conditions  that 
must be satisfied by the attribute values of the flow graph generated. However, this is not
how  they  are  typically  used.) 

Figure 3-5: An example attributed flow graph grammar. [Graph diagrams not reproduced.
Legible rule annotations include the attribute condition Color(b) = Color(A) = Color(g)
and the attribute transfer rule Color(S) := Color(A).]

A parser for an attributed grammar engages in the following three activities when given
a string (or graph, in the case of attributed graph grammars) x:

1.  Structural  analysis  -  recover  a  derivation  of  x  from  a  start  type  of  the  grammar  and 
create  a  derivation  tree  to  represent  the  derivation.  If  no  derivation  tree  is  found, 
reject  x  for  membership  in  the  language  of  the  grammar.  (This  is  the  usual  activity 
performed  by  recognizers  for  non-attributed  grammars.) 

2.  Attribute  evaluation  -  propagate  attribute  values  throughout  the  derivation  tree  in 
accordance  with  the  attribute  transfer  rules.  Values  for  synthesized  attributes  move 
upward  as  a  function  of  the  attribute  values  of  the  descendants  of  a  node,  while 
inherited  attribute  values  move  downward  from  the  ancestors. 

3.  Attribute  condition  checking  -  maintain  the  invariant  that  if  all  attribute  values  are 
known  for  the  attributes  related  by  an  attribute  condition,  then  the  condition  must 
hold.  If  a  condition  fails  to  hold,  reject  x. 

If the recognizer finishes with an attributed derivation tree for x and all attribute
conditions of all productions involved are satisfied, then x is recognized as a member of the
language.

For example, Figure 3-6 shows the derivation tree that would result from parsing the
attributed  flow  graph  in  Figure  3-1  in  accordance  with  the  grammar  of  Figure  3-5.  The 
edges  are  drawn  between  the  leaves  of  the  derivation  tree  to  show  the  edge  attributes  that 
are  involved  in  the  parse.  Dashed  arrows  show  the  propagation  of  attribute  values. 

The  three  parsing  activities  can  be  interleaved.  The  interleaving  is  particularly  simple 
in  our  parser,  since  only  synthesized  attributes  are  used.  All  attribute  values  of  a  derivation 
node  depend  only  on  the  attributes  of  the  node's  descendants.  Attribute  conditions  can 
be checked as soon as the right-hand side of a rule is recognized. Attribute values can
be computed and transferred to the left-hand side node during the reduction of the
right-hand side to the left-hand side. Because the attribute condition checking is folded into the

Figure 3-6: An attributed derivation tree.

structural  parsing  process  (i.e.,  conditions  are  checked  each  time  a  reduction  is  attempted), 
invalid  parses  can  be  cut  off  early. 
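
The interleaving described above can be sketched as a single reduction step that first checks the rule's attribute conditions and only then synthesizes the left-hand side attributes. The dictionary layout of a rule and the name `try_reduce` are assumptions for illustration; the sample rule is modeled on the Color condition and transfer rule of Figure 3-5.

```python
def try_reduce(rule, rhs_attrs):
    """Attempt one reduction of a recognized right-hand side.

    rule: {"conditions": [predicate, ...], "transfers": {attr_name: function}}
    rhs_attrs: maps each right-hand side node name to its attribute dict.
    Returns the attributes of the new left-hand side node, or None when a
    condition fails, so that the invalid parse is cut off immediately.
    """
    if not all(cond(rhs_attrs) for cond in rule["conditions"]):
        return None
    # only synthesized attributes: every value is computed from the right-hand side
    return {name: fn(rhs_attrs) for name, fn in rule["transfers"].items()}


# Hypothetical encoding of: Color(b) = Color(A) = Color(g), Color(S) := Color(A)
rule_s = {
    "conditions": [lambda n: n["b"]["color"] == n["A"]["color"] == n["g"]["color"]],
    "transfers": {"color": lambda n: n["A"]["color"]},
}

matching = {"b": {"color": "red"}, "A": {"color": "red"}, "g": {"color": "red"}}
print(try_reduce(rule_s, matching))  # {'color': 'red'}
```

Because no inherited attributes are involved, everything the step needs is already available when the right-hand side has been recognized.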

In  the  future,  if  inherited  attributes  are  needed,  a  more  sophisticated  attribute  evaluation 
and  condition  checking  strategy  will  need  to  be  employed  (for  example  [28,  68,  73,  109]). 

3.3  Motivations for Formalism: Program Recognition Application

So  far,  the  basics  of  the  flow  graph  formalism  have  been  described.  There  are  two  major 
extensions  to  this  formalism  that  increase  the  class  of  flow  graphs  and  grammars  that  can 
be  succinctly  expressed  in  it.  However,  before  they  are  described,  this  section  briefly  shows 
how  the  basic  formalism  is  used  in  a  particular  application  domain.  This  provides  some 
rationale  for  the  restrictions  on  the  grammar  formalism  that  have  been  described  so  far. 
(This  section  is  not  needed  to  understand  the  extensions.  It  may  be  read  after  the  extensions 
have  been  discussed.) 

We  apply  the  flow  graph  formalism  to  the  representation  of  programs  and  programming 
cliches.  In  particular,  flow  graphs  serve  as  graphical  abstractions  of  programs,  flow  graph 
grammars  encode  allowable  implementation  steps  between  abstract  operations  and  lower- 
level  operations,  and  the  derivation  trees  resulting  from  parsing  give  the  program’s  top-down 
design. 


(DEFUN RIGHTP (HYPOTENUSE SIDE1 SIDE2)
  (LET* ((HYP-SQ (SQ HYPOTENUSE))
         (DIFF (- HYP-SQ
                  (+ (SQ SIDE1)
                     (SQ SIDE2))))
         (DELTA (IF (< DIFF 0)
                    (NEGATE DIFF)
                    DIFF)))
    (IF (<= DELTA (* HYP-SQ 0.02))
        T
        NIL)))

Figure  3-7:  Testing  whether  the  three  input  sides  form  a  right  triangle. 

The  flow  graph  is  used  to  represent  the  operations  of  a  program  and  the  dataflow  between 
them.  Each  non-sink  node  in  a  flow  graph  represents  a  function,  with  ports  on  the  node 
representing  distinct  inputs  and  outputs  of  the  function.  The  ports’  types  are  determined 
by  the  signature  of  the  function.  Sink  nodes  represent  conditional  tests.  The  edges  of  a 
flow  graph  represent  dataflow  constraints  between  the  functions  and  tests.  When  the  result 
of  a  function  is  consumed  by  more  than  one  function,  the  edges  representing  the  dataflow 
fan  out.  Edges  that  fan  in  represent  the  conditional  merging  of  more  than  one  dataflow. 

For example, Figure 3-8 shows the flow graph representing the code shown in Figure
3-7.² RIGHTP determines whether the inputs could be the lengths of the sides of a right
triangle. It checks whether the square of HYPOTENUSE is approximately equal to the sum of
the squares of SIDE1 and SIDE2.
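
For readers less familiar with Lisp, the computation performed by RIGHTP can be transliterated as follows. This is a sketch that assumes SQ squares its argument and NEGATE negates it, as the surrounding description suggests; the Python name `right_p` is hypothetical.

```python
def right_p(hypotenuse, side1, side2):
    """Return True if the sides could form a right triangle, to within
    2 percent of the squared hypotenuse."""
    hyp_sq = hypotenuse ** 2
    diff = hyp_sq - (side1 ** 2 + side2 ** 2)
    delta = -diff if diff < 0 else diff  # absolute value via negate-if-negative
    return delta <= hyp_sq * 0.02


print(right_p(5, 3, 4))  # True
print(right_p(5, 3, 3))  # False
```

The `delta` line is the Absolute Value cliche and the final comparison is the equality-within-epsilon test, the two patterns the chapter goes on to encode as grammar rules.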

Two special nodes of type $B$ and $E$, which are not in N ∪ T, cap the ends of the
flow graph. These hold ports that represent the input and output values of data consumed
or produced by the code. These nodes make it easy to represent the fan-out of input data
to more than one function and the conditional fan-in of output data. For example, port 1
on $E$ receives fan-in representing the conditional output of either constant T or NIL.

Attributes  on  nodes  and  edges  are  used  to  capture  characteristics  of  a  program  that 
cannot  be  adequately  expressed  in  the  structure  of  a  flow  graph.  Control  flow  information 
is stored in the attributes of the flow graph representing a program. Each node has a
control  environment  attribute  whose  value  indicates  under  which  conditions  the  operation 
represented  by  the  node  is  executed.  Nodes  in  the  same  control  environment  represent 
functions  that  are  all  executed  under  the  same  conditions.  (Section  4.1.1  describes  the 
vocabulary  of  attributes  and  attribute  conditions  used  by  the  recognition  system  in  more 
detail.) 

Sink nodes, representing conditional tests, carry two additional attributes, success-ce
and failure-ce. These specify the control environments whose operations are executed when
the conditional test succeeds or fails, respectively.

² The function RIGHTP is taken from Problem 3-9 (p. 42) in [148].

Figure 3-8: Attributed flow graph for RIGHTP.

Each  edge  holds  a  ce-from  attribute  which  indicates  the  control  environment  in  which 
the  edge  carries  dataflow.  (In  Figure  3-8,  only  ce-from  attributes  of  edges  that  fan-in  are 
shown, to reduce clutter. The edges that do not fan-in all have ce^ as their ce-from attribute
value.) 

Each  edge  also  carries  a  constant-type  attribute  whose  value  is  either  a  constant  (such  as 
T, NIL, 0) or undefined, depending on whether the edge represents dataflow from a constant.
For  edges  whose  source  is  not  a  port  on  node  $B$,  the  constant  type  is  always  undefined. 
This  attribute  is  not  shown  in  Figure  3-8  for  edges  for  which  its  value  is  undefined. 

Program  cliches  are  encoded  in  flow  graph  grammar  rules.  Informally,  a  rule  can  be  seen 
as specifying how an abstract operation, represented by the rule’s left-hand side node, is
implemented in terms of lower-level operations, represented by the right-hand side flow graph.
(Section  4.1  gives  more  details  of  how  this  is  done,  as  well  as  other  relationships  between 
cliches,  besides  implementation  relationships,  which  are  captured  in  grammar  rules.) 

Figure 3-9 shows a grammar containing a rule that represents the common cliche of
testing whether two numbers are within some “epsilon” of each other. The rules representing
two common implementations of the Absolute Value cliche demonstrate that the grammar
allows us to modularly specify implementation variations. The rules have typical embedding
relations. In the rule for Negate-if-Negative, two right-hand side ports (<1 and negate1)
correspond to the same left-hand side port. This represents the constraint that the input
to an isomorphic instance of the right-hand side must come from a source that fans out to
both <1 and negate1.

The rule for Negate-if-Negative also has a right-hand side port (<2) that does not
correspond to any left-hand side port. This right-hand side port represents the input coming
from the constant 0. It is important that in our formalism a right-hand side port is not
required to correspond to a left-hand side port, since otherwise we would have to add an
input to Negate-if-Negative to correspond to <2. This would destroy the modularity of the
grammar, since the extra input must be propagated up through the rules that use Negate-if-Negative.
We would need to add an input to the Absolute-Value node, but this extra input
would be meaningless for Absolute-Value’s other implementation as Square-Root-of-Square.

Figure 3-9: Flow graph grammar encoding cliches found in RIGHTP. Annotations in the figure, in order of appearance:

Attribute-Transfer Rules:
ce := ce(null-test).
success-ce := failure-ce(null-test).
failure-ce := success-ce(null-test).

Attribute-Transfer Rules:
ce := ce(Negate-if-Negative).

Attribute-Transfer Rules:
ce := ce(Square-Root-of-Square).

Attribute-Conditions:
1. Second input receives constant type = 0.
2. Dataflow to "negate" is in failure-ce(null-test).
3. Data flows straight-through from input to output in success-ce(null-test).

Attribute-Transfer Rules:
ce := ce(null-test).

Attribute-Transfer Rules:
ce := ce(SQRT).

The  rule  for  Negate-if-Negative  also  shows  how  st-thrus  are  used  to  represent  cliched 
operations  in  which  some  of  the  input  data  is  not  acted  upon,  but  passes  directly  to  the 
output. 

This  grammar  also  shows  typical  attribute  conditions  and  attribute  transfer  rules. 
(These  are  stated  informally  in  English  in  Figure  3-9.  Section  4.1.1  gives  a  more  formal 
description  of  the  actual  attribute  language  used  in  encoding  cliches.)  A  typical  attribute 
condition  placed  on  an  edge’s  attribute  in  a  grammar  rule  is  that  it  must  carry  dataflow  in 
a  particular  control  environment  (e.g.,  the  failure-ce  of  some  test). 

Attribute  conditions  and  transfer  rules  may  refer  to  attributes  of  nodes  and  edges  of  the 
rule’s  right-hand  side.  In  addition,  they  may  refer  to  edges  in  the  input  graph  whose  sources 
or  sinks  match  the  inputs  or  outputs  of  the  rule’s  right-hand  side,  or  to  edges  matching  st- 
thrus.  For  example,  the  rule  for  Negate-if-Negative  constrains  the  input  to  <2  to  come  from 
a  constant  source  of  type  0.  It  also  constrains  the  ce-from  attribute  of  edges  whose  sources 
match negate1 and of edges matching the st-thru.
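
As a rough illustration of how such attribute conditions and transfer rules might be checked once a candidate match for the right-hand side of Negate-if-Negative has been found, consider the Python sketch below. The function name, the argument layout, and the dictionary returned for the reduced node are assumptions made for this example; the actual attribute language is the one described in Section 4.1.1.

```python
from dataclasses import dataclass
from typing import Optional

UNDEFINED = "undefined"  # assumed marker for an undefined constant-type


@dataclass(frozen=True)
class Edge:
    ce_from: str
    constant_type: object = UNDEFINED


@dataclass(frozen=True)
class NullTest:
    ce: str
    success_ce: str
    failure_ce: str


def reduce_negate_if_negative(test: NullTest, into_lt2: Edge,
                              into_negate: Edge, st_thru: Edge) -> Optional[dict]:
    """Check the three attribute conditions; on success apply the transfer rule."""
    conditions = (
        into_lt2.constant_type == 0,             # 1. second input of "<" is the constant 0
        into_negate.ce_from == test.failure_ce,  # 2. dataflow to negate in failure-ce
        st_thru.ce_from == test.success_ce,      # 3. straight-through flow in success-ce
    )
    if not all(conditions):
        return None
    # Attribute-transfer rule: ce := ce(null-test)
    return {"type": "Negate-if-Negative", "ce": test.ce}


test = NullTest(ce="ce1", success_ce="ce2", failure_ce="ce3")
accepted = reduce_negate_if_negative(test, Edge("ce1", 0), Edge("ce3"), Edge("ce2"))
rejected = reduce_negate_if_negative(test, Edge("ce1", 1), Edge("ce3"), Edge("ce2"))
print(accepted, rejected)
```

The second call is rejected because the edge into the second input of "<" carries the constant 1 rather than 0.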

3.3.1  The  Partial  Program  Recognition  Problem 

We formulate the problem of recognizing cliches in programs in terms of solving a parsing
problem  for  flow  graphs.  This  section  defines  these  problems. 

The parsing problem for flow graphs is: Given a flow graph F and a flow graph grammar
G, if F is in the language of G, then produce all possible parses for F (i.e., all possible
derivation trees that yield F).

The  subgraph  parsing  problem  for  flow  graphs  is:  Given  a  flow  graph  F  and  a  flow  graph 
grammar  G,  find  all  possible  parses  of  all  sub-flow  graphs  of  F  that  are  in  the  language  of 
G. 

There are two types of program recognition: total, in which the entire program is recognized
as a single cliche, and partial, in which the program may contain unrecognizable
parts but as much of the program as possible is recognized as one or more cliches.

The  total  recognition  problem  for  programs  is:  Given  a  program  and  library  of  cliches, 
determine  which  cliches  in  the  library  are  instantiated  by  the  program  as  a  whole.  (Usually 
a  single  program  is  recognizable  as  an  instance  of  only  one  cliche,  but  this  general  definition 
includes  cases  in  which  a  program  can  be  viewed  in  more  than  one  way.) 

The  partial  recognition  problem  is:  Given  a  program  and  a  library  of  cliches,  find  all 
instances  of  the  cliches  in  the  program  (i.e.,  determine  which  cliches  are  in  the  program  and 
their  locations). 

In this work, we are more interested in the partial recognition problem for programs.
(The total recognition problem is subsumed by it.) When we say “program recognition” we
mean partial program recognition.

Figure 3-10: Cliches recognized in RIGHTP.

The  partial  program  recognition  problem  is  solved  by  formulating  it  as  a  subgraph 
parsing  problem:  Given  a  flow  graph  F  representing  the  program’s  dataflow  and  a  cliche 
library  encoded  as  a  flow  graph  grammar  G  (with  all  non-terminals  that  represent  cliches 
as  start  types),  solve  the  subgraph  parsing  problem  on  F  and  G. 

The  derivation  trees  that  are  produced  are  called  design  trees.  The  root  of  the  tree 
identifies  a  particular  cliche  that  was  recognized  and  the  yield  of  the  tree  indicates  where 
the  cliche  was  found.  Intermediate  non-terminals  in  the  tree  indicate  the  subcliches  that 
implement  the  cliche  that  was  found.  Thus,  casting  partial  program  recognition  as  a  parsing 
problem  yields  as  output  not  only  the  set  of  cliches  and  their  locations,  but  also  relationships 
between  the  cliche  instances. 

For example, Figure 3-10 shows the design tree produced by partially recognizing the
program RIGHTP, represented as the flow graph in Figure 3-8, using the graph grammar
of Figure 3-9.
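
The information carried by a design tree (the recognized cliche at the root, the subcliches at intermediate non-terminals, and the location given by the yield) can be sketched in Python as follows. The `DesignNode` class and the small tree built here are hypothetical; the labels loosely follow cliche names mentioned in the text, and the tree is not a reproduction of Figure 3-10.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DesignNode:
    label: str
    children: List["DesignNode"] = field(default_factory=list)

    def tree_yield(self) -> List[str]:
        """Leaves of the tree, left to right: where the cliche was found."""
        if not self.children:
            return [self.label]
        leaves: List[str] = []
        for child in self.children:
            leaves.extend(child.tree_yield())
        return leaves

    def subcliches(self) -> List[str]:
        """Labels of intermediate non-terminals below this node."""
        found: List[str] = []
        for child in self.children:
            if child.children:
                found.append(child.label)
                found.extend(child.subcliches())
        return found


# Hypothetical design tree: Absolute-Value implemented as Negate-if-Negative.
tree = DesignNode("Absolute-Value", [
    DesignNode("Negate-if-Negative", [
        DesignNode("<"), DesignNode("null-test"), DesignNode("negate"),
    ]),
])

print(tree.label)         # the cliche that was recognized
print(tree.subcliches())  # the subcliches that implement it
print(tree.tree_yield())  # the primitive operations it covers
```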

When  a  program  is  partially  recognized,  one  or  more  sub-flow  graphs  of  the  program’s 
flow  graph  encoding  are  recognized  as  members  of  the  language  of  the  graph  grammar  which 
encodes  the  cliche  library.  From  the  definition  of  a  sub-flow  graph,  we  can  see  that  it  is 
possible  to  ignore  portions  of  a  flow  graph  before  and  after  a  recognizable  sub-flow  graph, 
as  well  as  portions  that  fan  out  from  or  into  an  internal  port  in  the  sub-flow  graph. 


3.4  Extensions  to  the  Flow  Graph  Formalism 

The  next  two  sections  discuss  two  major  extensions  to  the  flow  graph  grammar  formalism 
described  so  far.  The  first  extension  follows  closely  an  extension  made  by  Lutz  [90]  to  a 
graph  formalism  similar  to  ours,  while  the  second  is  novel  to  our  research.  The  extensions 
are  the  following. 


1. We expand the language of a flow graph grammar to include all flow graphs derivable
not only from a start type of the flow graph grammar, but also from flow graphs that
are “share-equivalent” to a sentential form of the grammar. The notion of share-equivalence
captures the types of variation due to structure-sharing that the extended
formalism abstracts away. In a structure-sharing flow graph, a node plays the role
of more than one node of the same type by generating output that fans out or by
receiving input that fans in.

2. We extend the expressiveness of the flow graph grammar to allow it to capture the
rewriting of a single input (or output) of a non-terminal node into an aggregation of
inputs (or outputs) of a sub-flow graph. We then further expand the language of a
flow graph grammar to include all flow graphs that are “aggregation-equivalent” to
the flow graphs derivable from the grammar. The notion of aggregation-equivalence
defines the variation tolerated in how aggregates are organized.

In the program recognition application, the first extension is needed to deal with variation
due to the common engineering optimization of function-sharing. The second extension
is important in being able to represent and recognize cliched operations on aggregate data
structures.

These extensions to the formalism are described in this section. However, the mechanisms
by which the parsing problem is solved for flow graphs in the extended formalism
are described in Section 3.5, after the parsing process for the basic unextended formalism
is presented.

We make these extensions to remove some forms of variation between semantically equivalent
programs that are not abstracted away by the graph representation alone. We essentially
do this by imposing an equivalence relation on the graphs representing the programs.
Alternatively, we could impose the equivalence relation at the source text level by transforming
program expressions directly. For example, a great deal of work has been done in
the term rewriting area [60, 61, 75]. These techniques are good for canonicalizing localized
parts of a program (e.g., by algebraic simplification and normalization). However, if the
expression that we want to rewrite is delocalized and interleaved with unrelated expressions,
we need to first apply subexpression shuffling and copying transformations to localize
it. This is avoided in the graph representation, which tends to localize related operations.
Expression-based techniques also fall prey to syntactic variation. It would be useful to
combine the expression-based rewriting techniques with graph-based parsing. One way is
to canonicalize the text as much as possible first and then convert to the graph-based representation
and parse. Another is to interleave the two (maintaining multiple representations)
so that expression-based simplifications and normalizations can be done to aid recognition
and the graph-based representation can localize expressions to rewrite and abstract away
syntactic differences.

A sentential form of a graph grammar is any flow graph that is derivable from a start type of the
grammar by the application of zero or more productions of the grammar.

Figure 3-11: These flow graphs should all be seen as equivalent.

3.4.1  Structure-Sharing 

Flow  graphs  can  be  used  to  represent  collections  of  components  having  inputs  and  outputs 
that  are  produced  or  consumed  by  each  other.  In  using  this  representation,  we  would  like 
to  be  able  to  view  a  flow  graph  in  which  two  or  more  components  of  the  same  type  are 
collapsed  into  a  single  shared  component  as  being  equivalent  to  a  flow  graph  in  which  the 
two  components  are  not  collapsed.  See  Figure  3-11. 

This is important in dealing with variation due to function-sharing in engineering applications
of the formalism. Function-sharing is a common engineering optimization made
during design, in which one component fulfills more than one purpose. For example, in an
optimized program, two or more functions may be applied to the result of a single (shared)
function application.

We employ a notion of share-equivalence to capture the relationship between flow graphs,
such as those in Figure 3-11. This notion was introduced by Lutz [90] for graphs similar to
ours. Share-equivalence is defined in terms of a binary relation collapses (denoted $\triangleleft$) on
flow graphs. Flow graph $F_1$ collapses flow graph $F_2$ if and only if there are two nodes $n_1$
and $n_2$ of the same node type $t$ in $F_2$, having input arity $I$ and output arity $O$, such that
all of these conditions hold:

1. Either one or both of the following are true:

(a) $\forall i = 1, \ldots, I$, the $i$th input port of $n_1$ is connected to the same set of output ports
as the $i$th input port of $n_2$.

(b) $\forall j = 1, \ldots, O$, the $j$th output port of $n_1$ is connected to the same set of input ports
as the $j$th output port of $n_2$.

2. $F_1$ can be created from $F_2$ by replacing $n_1$ and $n_2$ with a new node $n_3$ of type $t$ with
the $i$th input (resp., output) of $n_3$ connected to the union of the ports connected to
the $i$th inputs (resp., outputs) of $n_1$ and $n_2$.

3. The attribute values of $n_1$ and $n_2$ can be “combined.” This is done by applying an
attribute combination function, which is defined for each attribute, to the attribute
values of $n_1$ and $n_2$. The attribute combination functions may be partial functions. If
the function is not defined for $n_1$ and $n_2$’s attributes, then the attribute values cannot
be combined (and $F_1$ does not collapse $F_2$).

Figure 3-12: a) A grammar. b) Its core language. c) Some flow graphs in its expanded
language.

For example, in Figure 3-11, $F_1$ collapses $F_2$, which collapses $F_3$. Performing the transformation
in condition 2 from $F_2$ to $F_1$ is called “zipping up” $F_2$. Its inverse is referred to
as “unzipping”.

The reflexive, symmetric, transitive closure of collapses, $\triangleleft^*$, defines the equivalence
relation share-equivalent. (In Figure 3-11, $F_1$, $F_2$, and $F_3$ are all share-equivalent.)
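
The three conditions of the collapses relation translate fairly directly into a zip-up operation on an edge-set encoding of a flow graph. The Python sketch below is a minimal illustration under several assumptions: the encoding of ports as `(node name, index)` pairs, the `combine_if_equal` attribute combination function (defined only when the attributes agree), and the example graphs `g3`, `g2`, `g1` are all hypothetical and are not the graphs of Figure 3-11.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Port = Tuple[str, int]               # (node name, 1-based port index)
Edge = Tuple[Port, Port]             # (output port, input port)
Nodes = Dict[str, Tuple[str, dict]]  # node name -> (node type, attributes)


def combine_if_equal(a: dict, b: dict) -> Optional[dict]:
    """Assumed attribute combination: defined only when the attributes agree."""
    return dict(a) if a == b else None


def zip_up(nodes: Nodes, edges: Set[Edge], arity: Dict[str, Tuple[int, int]],
           n1: str, n2: str, merged: str,
           combine: Callable[[dict, dict], Optional[dict]] = combine_if_equal):
    """Return (nodes, edges) of the collapsed graph, or None if n1 and n2 cannot be zipped."""
    (t1, a1), (t2, a2) = nodes[n1], nodes[n2]
    if n1 == n2 or t1 != t2:
        return None
    n_in, n_out = arity[t1]

    def sources(n: str, i: int) -> Set[Port]:
        return {s for (s, d) in edges if d == (n, i)}

    def sinks(n: str, j: int) -> Set[Port]:
        return {d for (s, d) in edges if s == (n, j)}

    same_in = all(sources(n1, i) == sources(n2, i) for i in range(1, n_in + 1))
    same_out = all(sinks(n1, j) == sinks(n2, j) for j in range(1, n_out + 1))
    if not (same_in or same_out):   # condition 1
        return None
    attrs = combine(a1, a2)         # condition 3 (combination may be undefined)
    if attrs is None:
        return None

    def rename(p: Port) -> Port:
        return (merged, p[1]) if p[0] in (n1, n2) else p

    new_nodes = {k: v for k, v in nodes.items() if k not in (n1, n2)}
    new_nodes[merged] = (t1, attrs)
    new_edges = {(rename(s), rename(d)) for (s, d) in edges}  # condition 2
    return new_nodes, new_edges


arity = {"src": (0, 1), "a": (1, 1), "b": (1, 0)}
g3_nodes = {"s": ("src", {}), "a1": ("a", {}), "a2": ("a", {}),
            "b1": ("b", {}), "b2": ("b", {})}
g3_edges = {(("s", 1), ("a1", 1)), (("s", 1), ("a2", 1)),
            (("a1", 1), ("b1", 1)), (("a2", 1), ("b2", 1))}

g2_nodes, g2_edges = zip_up(g3_nodes, g3_edges, arity, "a1", "a2", "a")
g1_nodes, g1_edges = zip_up(g2_nodes, g2_edges, arity, "b1", "b2", "b")
print(sorted(g1_edges))
```

Because edges are kept in a set, the union of connections required by condition 2 falls out of renaming the endpoints; duplicate edges created by the merge disappear automatically.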

The directly derives relation ($\Rightarrow$) between flow graphs is redefined as follows. A flow
graph $F_1$ directly derives another flow graph $F_2$ if and only if either $F_2$ can be produced by
applying a grammar rule to $F_1$, $F_1 \triangleleft F_2$, or $F_2 \triangleleft F_1$.

As in string grammars, the reflexive, transitive closure of $\Rightarrow$ is the derives relation ($\Rightarrow^*$).
The language of a flow graph grammar G (denoted $L(G)$) is the set of all flow graphs whose
nodes are of terminal type and which can be derived from a start type of G.
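
The redefined relation and the resulting language can be written compactly as follows. The symbols $\Rightarrow_G$ (a single grammar rule application) and $\mathrm{Start}(G)$ (the set of start types of G) are shorthand introduced here for illustration and are not notation from the formalism itself.

```latex
\begin{align*}
F_1 \Rightarrow F_2 \;&\iff\; F_1 \Rightarrow_G F_2
    \;\lor\; F_1 \triangleleft F_2 \;\lor\; F_2 \triangleleft F_1, \\
L(G) \;&=\; \{\, F \mid S \Rightarrow^{*} F \text{ for some } S \in \mathrm{Start}(G),
    \text{ and every node of } F \text{ is of terminal type} \,\}.
\end{align*}
```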

Thus, the notion of a language of a flow graph grammar G has been extended to include
flow graphs that are generated by a series of not only production rule applications but
also zip-up and unzipping transformations. Since a zip-up or unzipping step can happen
anywhere in the derivation sequence, the language of a graph grammar G in this extended
formalism is a superset of the set of flow graphs share-equivalent to flow graphs in the
“core” language of G in the unextended formalism. For example, the flow graphs in Figure
3-12c are included in the language of the grammar in Figure 3-12a, even though they are
not share-equivalent to either of the flow graphs in the grammar’s core language, shown in
Figure 3-12b.

Figure 3-13: a) A grammar. b) A derivation sequence. c) A derivation graph representing
the derivation.

Both  generators  and  parsers  for  the  language  of  a  flow  graph  grammar  can  interleave 
zipping  and  unzipping  transformation  steps  with  their  usual  expansion  and  reduction  steps. 
The  parser  used  by  the  program  recognition  system  reported  here  simulates  the  introduction 
of  these  transformations  into  its  reduction  sequence,  as  is  described  in  Section  3.5.1. 


Structure-Sharing  Derivation  “Trees” 

The extensions to the language of a flow graph grammar affect how equivalent derivation
sequences  are  captured  in  a  single  canonical  tree  representation.  Because  flow  graph  zip-up 
can  occur  as  part  of  a  derivation  sequence  and  this  results  in  a  shared  subderivation,  the 
representation  of  a  derivation  as  a  tree  is  no  longer  possible.  Derivations  must  be  represented 
as  graphs.  For  example,  see  Figure  3-13. 

In addition, there may be different derivation graphs, depending on when unzipping
is done in the derivation sequence. For example, Figure 3-14a shows a simple flow graph
grammar and Figure 3-14b gives two possible derivation sequences. In the first sequence,
the unzipping transformation happens in the second step. In the second derivation sequence,
this transformation happens in the third step. An unzipping step is represented in
a derivation graph by a vertex that is a group of instances of that vertex, each with its own
sub-derivation. The two derivation sequences are represented by the two derivation graphs
in Figure 3-14c.

Figure 3-14: (a) A grammar. (b) Two derivations of the same flow graph. (c) Two derivation
graphs representing the derivations.

We arbitrarily choose those derivation graphs as canonical that represent derivation
sequences in which unzipping occurs at the earliest possible moment in the derivation sequence
(i.e., unzip a non-terminal before it is expanded). In our example, the derivation
graph on the left is taken as canonical.

3.4.2  Aggregation 

Grammar rules in our flow graph formalism specify how a non-terminal node can be rewritten
as a particular grouping of terminal and non-terminal nodes (in the form of a flow
graph). We now extend it to also specify how a single input or output of a non-terminal
node can correspond to an aggregation of the inputs or outputs of a flow graph to which
the non-terminal node is rewritten.

In engineering application domains, this is useful in representing not only how aggregations
of components make up a higher-level component, but also how the inputs and outputs
of the components are aggregated into fewer, more abstract types of inputs and outputs
of the higher-level component. In the programming domain, for example, the Circular Indexed
Sequence Insert cliche has two inputs: an element to insert and a cliched aggregate
data structure (the Circular Indexed Sequence). The insert is implemented by a group of
primitive operations with several of their inputs representing the various parts aggregated
by the single Circular Indexed Sequence data type.

This  section  first  considers  a  way  to  capture  the  aggregation  of  port  types  without 
extending  the  formalism.  This  is  found  to  be  too  intolerant  of  the  variation  that  may 
exist  in  the  way  port  types  are  aggregated.  However,  it  provides  useful  insights  into  what  is 
required  to  handle  the  variation.  In  particular,  a  notion  of  aggregation-equivalence  is  defined 
to  relate  flow  graphs  that  differ  only  in  how  they  aggregate  port  types.  The  language  of  a 
flow  graph  grammar  is  expanded  to  consist  of  all  flow  graphs  aggregation-equivalent  to  flow 
graphs  derivable  from  a  start  type  of  the  grammar. 

Using  Make  and  Spread  Nodes 

This section sets up a straw man which is a simple way to capture the aggregation of
port types into a single, more abstract port type without extending the graph grammar
formalism. This technique will work in restricted cases. However, as the next section
shows, it is too intolerant of variations in the organization of aggregates.

A simple way to capture the aggregation of port types into fewer, more abstract port
types is to use special nodes, called Make and Spread nodes. A Make node represents the
aggregation of input port types into the output port type, while a Spread node represents
the decomposition of the input port type into the output port types.

Each Make node has a tuple of input ports whose types compose the type of the Make’s
single output port. The node type of a Make node is defined by the ordered tuple of its
input ports’ types and its aggregate output port’s type. Two Make nodes match if they
collect the same tuple of input port types into the same aggregate output port type. Spread
nodes are analogous to Make nodes, but have a single input port of aggregate port type
and a tuple of output ports which have types composing the input port’s type.

Make and Spread node types come in pairs, called corresponding pairs. For each Make
node type, there is a corresponding Spread node type (and vice versa) for the same aggregate
type, such that the $i$th input of the Make corresponds to the $i$th output of the Spread in that
they have the same port type and represent the same part of the aggregate port type.
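
One way to picture Make and Spread node types and the corresponding-pair relation is the Python sketch below. The class names, the `corresponding` helper, and the example port types `x`, `y`, and `P` are illustrative assumptions rather than part of the formalism's definition.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MakeType:
    part_types: Tuple[str, ...]  # ordered input port types
    aggregate_type: str          # type of the single output port


@dataclass(frozen=True)
class SpreadType:
    aggregate_type: str          # type of the single input port
    part_types: Tuple[str, ...]  # ordered output port types


def corresponding(make: MakeType, spread: SpreadType) -> bool:
    """True if the i-th Make input and the i-th Spread output have the same
    port type, for the same aggregate type."""
    return (make.aggregate_type == spread.aggregate_type
            and make.part_types == spread.part_types)


make_p = MakeType(("x", "y"), "P")
spread_p = SpreadType("P", ("x", "y"))
spread_p_swapped = SpreadType("P", ("y", "x"))

print(corresponding(make_p, spread_p))
print(corresponding(make_p, spread_p_swapped))
```

Because the dataclasses are frozen and compared field by field, two `MakeType` values are equal exactly when they collect the same ordered tuple of part types into the same aggregate type, which mirrors the matching condition stated above.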

Using  Make  and  Spread  nodes,  we  can  now  write  production  rules  such  as  the  ones 
shown  in  the  grammar  of  Figure  3-15.  For  example,  in  the  right-hand  side  of  the  rule  for 
A,  Spread  and  Make  nodes  explicitly  show  how  the  inputs  and  outputs  of  nodes  a  and  b 
are  aggregated  into  the  abstract  port  type  P.  This  port  type  is  the  type  of  both  the  input 
and the output of the left-hand side node A. These types of rule require no extension to
the graph grammar formalism described in Section 3.2. $F_1$ in Figure 3-16 is the (only) flow
graph in the language of the grammar in Figure 3-15.

To  simplify  the  discussion,  we  assume  right-hand  sides  only  have  Spreads  and  Makes 
on  fringes  and  that  no  nesting  of  Spreads  or  Makes  occurs  on  any  right-hand  side.  A  flow 
graph  grammar  can  always  be  transformed  so  that  this  is  true. 

We  also  assume  that  abstraction  monotonically  increases  as  we  move  up  through  the 
grammar rules. Left-hand side port types are always either aggregates of (i.e., more abstract
than) their corresponding right-hand side port types or are of the same type as their
corresponding  right-hand  side  port  types.  Right-hand  side  port  types  are  never  aggregates 
of  left-hand  side  port  types.  This  means  no  flow  graph  in  the  language  of  a  flow  graph 
grammar  has  inputs  going  to  a  Make  node  or  outputs  coming  from  a  Spread  node. 

Problems  Due  to  the  Inflexibility  of  Makes  and  Spreads 

The flow graph $F_1$ in Figure 3-16 is the only one derivable from the start type S. However,
we  would  like  to  expand  the  language  of  the  grammar  to  include  flow  graphs  that  differ 
from  this  one  solely  in  the  way  port  types  are  aggregated  within  the  graph.  In  particular, 
the  organization  of  aggregated  port  types  may  vary  in  any  of  the  following  ways: 

1. Port types may be aggregated in any order, since aggregation is commutative. For
example, flow graph $F_2$ in Figure 3-16 aggregates types x and y into P in the opposite
order in which $F_1$ does.

2. Aggregations of port types may be nested within other aggregations and the organization
of this nesting does not matter, since aggregation is associative. For example,
flow graph $F_3$ aggregates y and w into type R and then aggregates x and R, while $F_1$
groups together x and y into P which is then aggregated with w.

3. Port types might not be aggregated at all. For example, flow graph $F_4$ is a variation of
flow graph $F_1$ in which no aggregation is done. A special case of this type of variation
is the variation due to the choice of which compositions of Spreads with Makes (and
vice versa) to simplify. For example, flow graph $F_5$ results from the simplification of
$F_1$’s composition of a Spread with a Make.

Figure 3-16: $F_1$ is the flow graph in the language of the grammar in Figure 3-15. The rest
are flow graphs aggregation-equivalent to it.

Aggregation- Equivalence 

We would like the flow graphs $F_2, \ldots, F_5$ to be in the language of the grammar of Figure
3-15, not just $F_1$. To describe the relationship between these flow graphs, we define the
equivalence relation aggregation-equivalent on flow graphs.

First,  we  need  to  define  the  following  terms. 

• A Make-of-Spread composition is a Spread node connected to a Make node of corresponding
type via edges between their corresponding part type ports. More precisely,
a Make-of-Spread is a corresponding pair of Make and Spread nodes, such that
$\forall i = 1, \ldots, m$, the $i$th output of the Spread node connects directly to the $i$th input of
the Make node and there are no other edges adjoining these ports (where $m$ is the
number of part port types aggregated).

• A Spread-of-Make composition is analogous. It is a Make node connected to a Spread
node of corresponding type via an edge between the Make’s output port and the
Spread’s input port.

Now we can define the reflexive, symmetric, transitive relation aggregation-equivalent.
A flow graph $F_1$ is aggregation-equivalent to another $F_2$ (denoted $F_1 =_a F_2$) if and only if
there exists a flow graph $F_3$, such that $F_1$ and $F_2$ can each be transformed to a flow graph
isomorphic to $F_3$, using a (possibly empty) sequence of the following transformations:

1. For some corresponding pair of Spread and Make node types, $T_s$ and $T_m$, permute the
outputs of all (Spread) nodes of type $T_s$ and the inputs of all (Make) nodes of type
$T_m$, keeping connections intact and using the same permutation for all the Spreads
and Makes. (The flow graphs $F_1$ and $F_2$ in Figure 3-16 can be transformed into each
other using this transformation.)

2. For all compositions of Spread nodes, replace the composition sub-flow graph with a
single Spread whose output arity, $m$, is the number of outputs of the sub-flow graph
and $\forall i = 1, \ldots, m$, the $i$th output of the new Spread has the same port type and
connections as the $i$th output of the sub-flow graph. Flatten all compositions of Make
nodes analogously. (This can be used to transform $F_1$ to $F_6$ (shown in Figure 3-17)
and $F_3$ to $F_6$, so $F_1 =_a F_3$ in Figure 3-16.)

3. For any Make-of-Spread composition, replace the Make-of-Spread composition with
edges from the ports adjacent to the input of the Spread to the ports adjacent to the
output of the Make.

4. For any Spread-of-Make composition, replace the Spread-of-Make composition with
new edges drawn in the following way: $\forall i = 1, \ldots, m$, connect the ports adjacent to the
$i$th input of the Make to the ports adjacent to the $i$th output of the Spread (where
$m$ = the Make’s input arity = the Spread’s output arity). ($F_5$ results from applying
this transformation to $F_1$ in Figure 3-16.)

5. Remove any Spread node whose input is an input of the flow graph and remove any
Make node whose output is an output of the flow graph. ($F_5$ can be transformed to
$F_4$ by using this transformation and by removing the Spread-of-Make composition.)

Figure 3-17: $F_3$ and $F_1$ can be transformed to this flow graph by flattening nested Makes
and Spreads.

Transformations 1 and 2 allow variation due to commutativity and associativity of aggregation,
respectively, while Transformations 3 and 4 allow variability in the simplification of
Spread-Make compositions. Transformation 5 is needed to allow flow graphs, like $F_4$, that
use no aggregation to be in the language of a grammar that aggregates port types.

We will call the first transformation the permutation transformation, since it permutes
the part port tuples of Makes and Spreads. The rest of the transformations are aggregation-removal
transformations. We will call the inverse of aggregation-removal transformations
aggregation-introduction transformations, since they insert Spreads and Makes into a flow
graph.
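
To make one of the aggregation-removal transformations concrete, the Python sketch below applies Transformation 4 (removal of a Spread-of-Make composition) to an edge-set encoding of a flow graph. The encoding, the function name, and the example node names are assumptions made here. The sketch also assumes that the edge from the Make's output to the Spread's input is the only edge at those two ports; the text states this exclusivity explicitly only for Make-of-Spread compositions.

```python
from typing import Set, Tuple

Port = Tuple[str, int]    # (node name, 1-based port index)
Edge = Tuple[Port, Port]  # (output port, input port)


def remove_spread_of_make(edges: Set[Edge], make: str, spread: str, m: int) -> Set[Edge]:
    """Replace a Make feeding a corresponding Spread (both with m parts) by direct edges."""
    link = ((make, 1), (spread, 1))
    at_make_out = {e for e in edges if e[0] == (make, 1)}
    at_spread_in = {e for e in edges if e[1] == (spread, 1)}
    # Assumption: the aggregate ports are joined to each other and to nothing else.
    if at_make_out != {link} or at_spread_in != {link}:
        raise ValueError("not a Spread-of-Make composition")

    # Keep every edge that touches neither the Make nor the Spread.
    result = {(s, d) for (s, d) in edges
              if s[0] not in (make, spread) and d[0] not in (make, spread)}
    # Connect the ports adjacent to the i-th Make input to those adjacent
    # to the i-th Spread output.
    for i in range(1, m + 1):
        srcs = [s for (s, d) in edges if d == (make, i)]
        dsts = [d for (s, d) in edges if s == (spread, i)]
        result |= {(s, d) for s in srcs for d in dsts}
    return result


edges = {
    (("u", 1), ("mk", 1)), (("v", 1), ("mk", 2)),  # parts flowing into the Make
    (("mk", 1), ("sp", 1)),                        # aggregate passed to the Spread
    (("sp", 1), ("p", 1)), (("sp", 2), ("q", 1)),  # parts flowing out of the Spread
    (("u", 1), ("r", 1)),                          # unrelated edge, left untouched
}
simplified = remove_spread_of_make(edges, "mk", "sp", 2)
print(sorted(simplified))
```

The other aggregation-removal transformations could be written in the same style, each as a function from one edge set to another.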

We can use the aggregation-equivalence relation to expand what we mean by the language
of a flow graph grammar. If we call the set of flow graphs derivable from the graph
grammar (using the “derives” relation defined in Section 3.4.1) the “core” language of the
grammar, then we can define the language of the grammar to consist of all flow graphs
aggregation-equivalent to flow graphs in the core language.
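
Stated symbolically, with $L_{\mathrm{core}}(G)$ as shorthand (introduced here, not taken from the original notation) for the core language of G:

```latex
\[
L(G) \;=\; \{\, F \mid \exists F' \in L_{\mathrm{core}}(G) \text{ such that } F =_a F' \,\}.
\]
```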

Useful  Definitions  and  Facts 

A flow graph $F_1$ is said to be less-aggregated than another $F_2$ if and only if $F_1$ can be
generated from $F_2$ by applying any of the aggregation-removal transformations above. This
relation is transitive. If there is no flow graph less-aggregated than a flow graph F, then F
is said to be minimally-aggregated.

There is only one minimally-aggregated flow graph less-aggregated than or isomorphic
to a particular flow graph that can be obtained by the aggregation-removal transformations.
(However, there may be more than one minimally-aggregated flow graph less-aggregated than or
isomorphic to a particular flow graph F that is aggregation-equivalent to F. These can be
transformed into one another by applying the permutation transformation.)

Whether  the  minimally-aggregated  flow  graph  has  any  Spreads  or  Makes  depends  on 
whether  the  formalism  allows  ports  on  terminal  nodes  to  have  aggregate  port  types.  If 
terminal  nodes  have  no  ports  of  aggregate  type,  then  minimally-aggregated  flow  graphs  will 
have  no  Spreads  or  Makes. 

To see this, suppose we have a minimally-aggregated flow graph F, with a Spread or
Make node n. The node n cannot be on F’s fringe since otherwise it could be removed
by Transformation 5 to create a flow graph less-aggregated than F. So, n must be an
internal node. It must also be flat (i.e., it is not nested with another Spread or Make node),
since otherwise Transformation 2 could be applied to create a less-aggregated flow graph.
Since n is internal, its aggregate port $p_1$ is connected to another port $p_2$, which must be of
aggregate port type. However, $p_2$ must be the aggregate port of a node of corresponding
Make or Spread type, since only Spreads and Makes can have ports of aggregate type. This
would mean F contains a Spread-of-Make composition, which means F is not minimally-aggregated.
Therefore, a minimally-aggregated flow graph cannot contain a Spread or Make
node if there are no aggregate port types allowed on terminal nodes.

On  the  other  hand,  if  terminal  nodes  have  ports  of  aggregate  type,  then  minimally- 
aggregated  flow  graphs  might  have  one  or  more  Spread  or  Make  nodes.  Using  reasoning 
similar  to  that  above,  we  can  see  that  all  Spread  or  Make  nodes  would  be  internal  and  flat, 
with  their  aggregate  port  connected  to  ports  on  terminal  nodes  that  are  not  Spread  or  Make 
nodes. 

These  facts  are  useful  in  developing  a  recognizer  for  languages  of  flow  graph  grammars 
that  aggregate  port  types. 

Recognizing  Aggregation-Equivalent  Flow  Graphs 

A generator or parser for the language of a flow graph grammar may perform the permutation,
aggregation-introduction and aggregation-removal transformations as steps in its
derivation or reduction sequence. Because there are many possible orderings in which to
apply the transformations and because doing this efficiently involves an extension to the
embedding relation of the graph grammar formalism, it is important to discuss how such a
recognizer is constructed. (A generator for the language is not described here, since we are
more interested in building recognizers for languages than we are in constructing language
generators, for the purposes of program recognition. A generator can easily be imagined by
reversing the recognition process.)

One  way  a  recognizer  for  the  language  can  work,  given  an  input  flow  graph  F,  is  in  two 
stages.  The  first  would  apply  some  sequence  of  the  permutation,  aggregation-removal  and 
aggregation-introduction  transformations  to  F  to  produce  a  flow  graph  F',  while  the  second 
would  apply  a  recognizer  for  the  core  language  to  F'.  A  flow  graph  F  would  be  recognized 
if  a  sequence  of  transformations  is  found  which  yields  a  new  flow  graph  F'  that  is  accepted 
by  a  recognizer  for  the  core  language.  Unfortunately,  the  first  stage  could  involve  a  great 
deal  of  search  to  find  the  appropriate  transformation  sequence. 

A more promising approach is to divide up the stages differently so that no choices need
to be made. In the first stage only aggregation-removal transformations that work “downward”
by creating less-aggregated flow graphs are applied until a minimally-aggregated flow
graph is obtained. Then in the second stage, the aggregation-introduction and permutation
transformations are interleaved with the reduction actions of the recognizer for the core
language. The idea is that the grammar rules can provide guidance as to what to aggregate
and how to organize the aggregation so that the flow graph will be recognizable as a member
of the core language. The aggregation guidance is found in the Spreads and Makes of the
rule’s right-hand side. This section gives the details of how the interleaving of recognition
with aggregation-introduction transformations works.
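
The first stage can be pictured as a fixed-point loop. The Python sketch below is illustrative only and is not taken from GRASPR. It assumes a hypothetical edge-set representation (the `Edge` tuple, port 0 for the aggregate port, part ports numbered from 1) and covers only the simplest removal step: eliminating a Make whose aggregate output feeds exactly one Spread of corresponding type.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    src: str        # source node id
    src_port: int   # 0 = aggregate port, 1..m = part ports
    dst: str        # destination node id
    dst_port: int


def remove_spread_of_make(kinds, edges):
    """Eliminate Make -> Spread compositions until none remain.

    kinds maps node id -> "make", "spread", or any other node type.
    Assumption: each such Make feeds only that Spread, and vice versa.
    """
    kinds, edges = dict(kinds), set(edges)
    while True:
        link = next((e for e in edges
                     if kinds.get(e.src) == "make"
                     and kinds.get(e.dst) == "spread"), None)
        if link is None:
            return kinds, edges          # no composition left to remove
        make, spread = link.src, link.dst
        into_make = {e for e in edges if e.dst == make}
        out_of_spread = {e for e in edges if e.src == spread}
        # Part i going into the Make is wired directly to the consumers
        # of part i coming out of the Spread.
        direct = {Edge(i.src, i.src_port, o.dst, o.dst_port)
                  for i in into_make for o in out_of_spread
                  if i.dst_port == o.src_port}
        edges = (edges - into_make - out_of_spread - {link}) | direct
        del kinds[make], kinds[spread]


# Hypothetical example: a and b are bundled by Make m, then unbundled by Spread s.
kinds = {"a": "op", "b": "op", "m": "make", "s": "spread", "c": "op", "d": "op"}
edges = {Edge("a", 1, "m", 1), Edge("b", 1, "m", 2), Edge("m", 0, "s", 0),
         Edge("s", 1, "c", 1), Edge("s", 2, "d", 1)}
new_kinds, new_edges = remove_spread_of_make(kinds, edges)
```

Because the loop only ever removes nodes, it terminates, which mirrors the point made above that the first stage involves no search.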

This is explained first for a restricted formalism in which no terminal nodes have ports of
aggregate port type and the union port type Any is a union of only primitive (non-aggregate)
port types. This simplifies the discussion since each minimally-aggregated flow graph in the
language of the graph grammar contains no Spreads or Makes.

Then  a  second  formalism  is  considered  in  which  the  restriction  is  relaxed  to  allow  the 
type  Any  to  be  a  union  of  all  port  types  (including  aggregate  port  types).  This  formalism 
is  still  restricted  in  that  the  only  (possibly)  aggregate  port  type  a  (non-Spread,  non-Make) 
terminal  node’s  port  may  have  is  Any.  In  this  case,  the  minimally-aggregated  flow  graphs 
in  the  graph  grammar’s  language  might  contain  Spreads  and  Makes.  However,  as  discussed 
above,  these  Spreads  and  Makes  will  each  be  flat  and  internal.  Each  Spread  node  must  have 
its  input  aggregate  port  connected  to  a  port  of  type  Any.  The  same  must  be  true  for  each 
Make  node’s  output  aggregate  port. 

(DEFUN POP-TWICE2 (STK)
  (LET* ((FIRST (AREF (STACK-ELTS STK)
                      (STACK-PTR STK)))
         (NEW-STK (MAKE-STACK :ELTS (STACK-ELTS STK)
                              :PTR (1+ (STACK-PTR STK))))
         (SECOND (AREF (STACK-ELTS NEW-STK)
                       (STACK-PTR NEW-STK)))
         (NEWER-STK (MAKE-STACK :ELTS (STACK-ELTS NEW-STK)
                                :PTR (1+ (STACK-PTR NEW-STK)))))
    (VALUES FIRST SECOND NEWER-STK)))

(DEFUN POP-TWICE (A I)
  (LET* ((FIRST (AREF A I))
         (NEW-I (1+ I))
         (SECOND (AREF A NEW-I))
         (NEWER-I (1+ NEW-I)))
    (VALUES FIRST SECOND A NEWER-I)))


Figure 3-18: Two programs each performing two consecutive Stack Pops.

What  the  Restrictions  Mean  in  the  Program  Recognition  Application 

These two restricted formalisms are sufficient for capturing the types of aggregation that
arise in dataflow graphs representing programs that operate on aggregate data structures.

Allowing only non-aggregate port types on terminals, although restrictive, is still very
useful in representing a wide class of programs and cliches in the program recognition
domain. For example, the minimally aggregated flow graph for both of the programs shown
in Figure 3-18 is given in Figure 3-19. (Attributes are not shown.) Each program can be
recognized as a Stack Pop, followed immediately by another Stack Pop, where the Stack is
implemented as an Indexed Sequence aggregate data cliche whose parts are an Index (an
integer) and a Base (a sequence).
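
To see informally why both programs share one minimally-aggregated flow graph, the following Python sketch transliterates the two Lisp functions of Figure 3-18. The `Stack` tuple and the function names are illustrative stand-ins for the Lisp structure and functions, not part of the original system. Once the aggregate result of the first version is spread into its parts, the two versions return the same values.

```python
from typing import NamedTuple, Sequence


class Stack(NamedTuple):
    elts: Sequence   # the Base part (a sequence)
    ptr: int         # the Index part (an integer)


def pop_twice2(stk):
    """Aggregate version: the stack travels as one structured value."""
    first = stk.elts[stk.ptr]
    new_stk = Stack(stk.elts, stk.ptr + 1)
    second = new_stk.elts[new_stk.ptr]
    newer_stk = Stack(new_stk.elts, new_stk.ptr + 1)
    return first, second, newer_stk


def pop_twice(a, i):
    """Spread version: the Base and Index parts travel separately."""
    first = a[i]
    new_i = i + 1
    second = a[new_i]
    newer_i = new_i + 1
    return first, second, a, newer_i


elts = ["x", "y", "z"]
first, second, stk = pop_twice2(Stack(elts, 0))
spread_result = (first, second, stk.elts, stk.ptr)   # spread the aggregate
same = spread_result == pop_twice(elts, 0)
```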

(When we create the minimally-aggregated flow graph representing a program that uses
user-defined aggregate data structures, we remove Spread and Make nodes, which contain
naming information that is useful for presenting the results of recognition. We convert this
information to another form (attributes). See Section 4.2.3 for a discussion of how this
information is used.)

The second less-restrictive formalism is useful in representing programs in which aggregate
data structures are collected into primitive data types such as arrays and lists (in
Common Lisp). The accessors and constructors of these primitive data types (e.g., CAR,
CONS, AREF) are primitives. They cannot be treated like Spreads or Makes of aggregate data
structures that have fixed, named parts, because their “parts” are accessed and inserted
at variable, computed positions. These primitive accessors and constructors have ports of
type Any.

Figure 3-19: The flow graph for the programs POP-TWICE and POP-TWICE2.

Figure 3-20: Flow graph with a node whose output port is of type Any.

For example, the code fragment (> New-Time (Event-Time (car Event-Queue))) is part
of a program for inserting a user-defined data structure, called an Event, into a Priority
Queue which is implemented as an Ordered Associative List. The Event has parts Time
(an integer) and Object (a Message, which is a user-defined type). The Event is treated as
a priority queue element, whose priority is the Time part. This code fragment is testing
whether the first element of the input list, Event-Queue, has a Time part less than the value
of New-Time (which is the Time of the event being inserted).

The  attributed  flow  graph  representing  this  code  fragment  is  shown  in  Figure  3-20.  Its 
CAR  has  an  output  of  type  Any.  (Rather  than  numeric  port  labels,  the  Spread  in  this  example 
uses  mnemonic  names,  such  as  Time,  for  clarity.) 

No  Aggregate  Port  Types  on  Terminals 

This  section  shows  how  the  actions  of  a  recognizer  for  the  core  language  are  interleaved 
with  aggregation-introduction  transformations  in  a  formalism  that  does  not  allow  ports  of 
aggregate  type  on  terminal  nodes. 

Since minimally-aggregated graphs have no Spreads or Makes, the Spreads and Makes
in the right-hand sides of rules cannot be matched. Only a sub-flow graph of the right-hand
side can be matched to nodes in the input graph. This sub-flow graph, called the
non-aggregated rhs, consists of the subset of nodes that are not Spreads or Makes and the
subset of edges connecting their ports.

Since  right-hand  sides  of  rules  are  assumed  to  contain  no  internal  Spreads  and  Makes, 
the  non-aggregated  rhs  is  the  right-hand  side  graph  minus  its  boundary  Spreads  and  Makes. 
These  boundary  Spreads  and  Makes  contain  valuable  information  about  how  the  inputs  and 
outputs  of  the  non-aggregated  rhs  should  be  aggregated  to  recognize  a  left-hand  side  that 
has  aggregate  port  types.  We  move  this  information  into  the  embedding  relation.  We 
remove  the  boundary  Spreads  and  Makes  so  the  right-hand  side  of  each  graph  grammar 
rule  becomes  the  non-aggregated  rhs. 

Recall that the embedding relation, as described so far, relates left-hand side ports to
right-hand side ports and other left-hand side ports. (That is, $C$ is a binary relation on
$\mathcal{L} \times (\mathcal{R} \cup \mathcal{L})$, where $\mathcal{L}$ and $\mathcal{R}$ are
the sets of left- and right-hand side ports, respectively.) A single left-hand side port can
correspond to a non-empty set of right-hand side and left-hand side ports, while a single
right-hand side port can correspond to at most one left-hand side port.

We extend this embedding relation to relate each left-hand side port to a tuple of right-hand
side and left-hand side port sets, where the position in the tuple is significant. More
precisely, the embedding relation $C$ is now on
$\mathcal{L} \times (2^{\mathcal{R} \cup \mathcal{L}})^n$, where $n$ varies. (A left-hand side
port and each right-hand side port in the tuple related to it are still said to “correspond”
with each other.)

The right-hand side ports are tupled and related to the left-hand side ports based on
the fringe Spread and Make nodes that are removed from each rule’s right-hand side. When
a Spread node of output arity m is removed, the left-hand side input port corresponding
to its input port becomes related to a tuple in which, for all i = 1, ..., m, the i-th element
of the tuple is the set of right-hand side ports (if any) connected to the i-th output of the
Spread. Similarly, when a Make node of input arity m is removed, the left-hand side output
port corresponding to its output becomes related to a tuple in which, for all i = 1, ..., m,
the i-th element of the tuple is the set of right-hand side ports (if any) connected to the
i-th input of the Make.

The rule for A in Figure 3-21a becomes the rule shown in Figure 3-21b when Spreads
and Makes are removed. Left-hand side port A1 is related to the tuple of right-hand side
ports <{a1, d1}, b1>. This is shown by tupling the Greek annotations associated with each
left-hand side port to reflect the aggregation of right-hand side ports corresponding to the
left-hand side port. (For simplicity, elements of tuples that are singleton sets degenerate to
the single element of the set in drawings. Tuples containing one element degenerate to that
one element.)
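
The tupling step for a removed fringe Spread can be sketched as follows. This is a minimal illustration in Python, assuming a hypothetical encoding in which the Spread's outgoing edges are given as (output index, right-hand side port) pairs. The function name and the port labels in the example are made up for the illustration.

```python
def tuple_for_removed_spread(arity, spread_output_edges):
    """Build the embedding tuple for the lhs port of a removed fringe Spread.

    arity: number of outputs (parts) of the Spread.
    spread_output_edges: iterable of (output_index, rhs_port) pairs, with
        output indices counted from 1.
    Returns a tuple whose i-th element is the set of rhs ports that were
    connected to the i-th output of the Spread (possibly empty).
    """
    elements = [set() for _ in range(arity)]
    for index, rhs_port in spread_output_edges:
        elements[index - 1].add(rhs_port)
    return tuple(frozenset(e) for e in elements)


# Hypothetical rule: output 1 fans out to ports a1 and d1, output 2 feeds b1.
embedding = tuple_for_removed_spread(2, [(1, "a1"), (1, "d1"), (2, "b1")])
```

The same construction applies to a removed Make, with the Make's inputs playing the role of the Spread's outputs.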

If any Spread node has an output j that connects directly to an input k of a Make node,
then a st-thru results between the left-hand side ports (l1 and l2) that originally
corresponded with the input of the Spread and the output of the Make, respectively.
Specifically, the j-th element of the tuple corresponding with l1 contains l2 and the k-th
element of the tuple corresponding with l2 contains l1.

Figure 3-21: (a) A rule which aggregates port types. (b) The same rule with aggregation
information moved to the embedding relation.

This  is  illustrated  in  Figure  3-22  where  the  rule  in  part  (a)  is  converted  to  the  rule  of 
part  (b)  which  contains  a  st-thru.  Ai  corresponds  with  in  part  y  of  aggregate  port  type 
P. 


Relation  To  Concrete  Application  Domain:  St-Thrus  in  Data  Aggregation 

This case arises quite frequently in the program recognition domain. Operations on
aggregate data structures in which all parts of the data structure are used and/or changed
are rare in the simulator programs. Most operations work on only a subset of the parts.
For example, the operation for removing the first element from the cliched aggregate data
structure Circular Indexed Sequence (abbrev. CIS) accesses only four of its five parts and
changes only two parts. As shown in Figure 3-23, the CIS data structure has a Base, which
is a sequence, a Size, which is an integer, a Fill-Count, which is an integer count of the
number of elements in the CIS, and two index pointers (First and Last), which are positive
integers that specify the indices of the first and last elements in the CIS. The removal
operation uses the CIS’s First part as an index into its Base part to retrieve the first element.
Then the First part is updated by being incremented or decremented (depending on the
direction of growth), modulo the Size part. The Fill-Count is also decremented. The Last
part is not used or changed. Also, the Base and Size parts are used but not changed. So,
there are three st-thrus in the rule for CIS Extract, representing the Last, Base, and Size
parts. The rule for CIS Extract is shown in Figure 3-24. (The CIS part names corresponding
to the elements of the tuples of correspondence labels are shown in the lower left-hand
corner.)

Figure 3-22: (a) An edge connects a Spread and Make. (b) This edge becomes a st-thru
when aggregation information is moved to the embedding relation.
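
The CIS removal operation described above can be written out directly. The Python sketch below is illustrative: the `CIS` tuple, its field names, the `step` parameter (standing in for the direction of growth), and the use of 0-based indices are assumptions of the sketch, and it performs no check for an empty sequence.

```python
from typing import NamedTuple, Sequence


class CIS(NamedTuple):
    base: Sequence    # Base: the underlying sequence
    size: int         # Size: capacity of the Base
    fill_count: int   # Fill-Count: number of elements currently stored
    first: int        # First: index of the first element
    last: int         # Last: index of the last element


def cis_extract(cis, step=1):
    """Remove the first element; step is +1 or -1 (direction of growth)."""
    element = cis.base[cis.first]                 # uses First and Base
    new_first = (cis.first + step) % cis.size     # uses Size, changes First
    updated = cis._replace(first=new_first,
                           fill_count=cis.fill_count - 1)
    # Base, Size and Last are carried over untouched: these are the three
    # parts that appear as st-thrus in the rule for CIS Extract.
    return element, updated


queue = CIS(base=["a", "b", "c", "d"], size=4, fill_count=2, first=3, last=0)
element, queue2 = cis_extract(queue)
```

Four parts are read (First, Base, Size, Fill-Count) and two are changed (First, Fill-Count), matching the counts given in the text.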

Using  the  Embedding  Relation  in  Reduction 

The  embedding  relation  plays  a  key  role  in  reduction  which  is  at  the  heart  of  the  recognition 
process.  A  flow  graph  is  recognized  if  it  can  be  reduced  to  a  single  node  having  a  start  type. 
Reduction  steps  are  analogous  to  rewriting  (or  generation)  steps.  Rather  than  rewriting 
an  occurrence  of  the  left-hand  side  of  a  rule  to  a  sub-flow  graph  isomorphic  to  the  rule’s 
right-hand  side,  we  reduce  an  isomorphic  occurrence  of  the  right-hand  side  to  an  instance 
of  the  left-hand  side.  In  both  cases,  the  embedding  relation  is  used  to  determine  how  to 
connect  the  replacement  sub-flow  graph  to  the  rest  of  the  graph,  called  the  host  graph. 

The  following  is  only  a  conceptual  description  of  the  reduction  mechanism.  While  a 
recognizer  can  be  implemented  to  perform  exactly  these  actions,  it  is  not  necessary  that 
it  do  so.  In  most  generators,  recognizers,  and  parsers,  the  flow  graph  is  not  destructively 
transformed  at  each  derivation  or  reduction  step.  The  rewriting  or  reduction  is  simulated 
in  the  state  of  the  generator,  recognizer,  or  parser.  This  allows  backtracking  and  multiple 
results  to  be  formed  (e.g.,  for  ambiguous  grammars). 

Recall that the unextended embedding relation is used as follows. When a sub-flow
graph R is reduced to an instance of a rule’s left-hand side L, an edge is created between a
port p_i in the host graph and a port L_j of L, if and only if p_i was connected to a port in R
that corresponds to L_j, according to the embedding relation.

Figure 3-23: Circular Indexed Sequence data structure.

Figure 3-24: The rule for Circular Indexed Sequence Extract.

Reduction using the extended embedding relation is more complicated. Several right-hand
side ports may correspond to the same left-hand side port, but we do not want all ports
in the host graph that are connected to these right-hand side ports to become connected to
the left-hand side port when the right-hand side is replaced with the left-hand side. Instead,
before we connect the left-hand side instance up to the ports of the host graph, we insert
Make and Spread nodes into the graph surrounding the left-hand side to bundle up the
inputs and outputs coming from or going to the ports of the host graph.

More specifically, for each left-hand side input port L_j having an aggregate port type,
a Make node is inserted. Its output is connected to L_j and its i-th input is connected to
the host graph ports that are connected to the right-hand side ports in the i-th element of
the tuple corresponding to L_j. Likewise, for each left-hand side output port L_k having an
aggregate port type, a Spread node is inserted. L_k is connected to the Spread’s input and
the i-th output of the Spread is connected to the host graph ports that are connected to the
right-hand side ports in the i-th element of the tuple corresponding to L_k.

The  Make  and  Spread  nodes  specify  how  the  minimally-aggregated  flow  graph  should 
be  aggregated  to  recognize  it  as  the  left-hand  side  of  the  rule.  When  the  reduction  results  in 
a  Make-of-Spread  composition,  the  composition  is  simplified.  (Note  that  Spread-of-Makes 
are  never  created  by  this  action.) 
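
The bundling of inputs during reduction can be sketched as below. This Python fragment is a conceptual illustration only (the parser simulates these steps rather than performing them). The function name, the `(node, port)` encoding, and the example port labels are assumptions of the sketch.

```python
def bundle_input(lhs_port, port_tuple, host_sources):
    """Return the edges created when a Make is inserted before an lhs input.

    lhs_port:     the aggregate-typed left-hand side input port.
    port_tuple:   the embedding tuple for lhs_port, i.e. a sequence of sets
                  of right-hand side input ports.
    host_sources: maps each rhs input port to the host-graph ports that
                  were connected to it in the matched sub-flow graph.
    """
    make = f"make-for-{lhs_port}"
    edges = [((make, "out"), lhs_port)]      # Make output feeds the lhs port
    for i, rhs_ports in enumerate(port_tuple, start=1):
        sources = set()
        for rhs_port in rhs_ports:
            sources |= set(host_sources.get(rhs_port, ()))
        # Host ports feeding any rhs port in element i feed Make input i.
        for src in sorted(sources):
            edges.append((src, (make, i)))
    return edges


# Hypothetical match: host port h1 fed rhs ports a1 and d1, h2 fed b1.
new_edges = bundle_input("A1", ({"a1", "d1"}, {"b1"}),
                         {"a1": ["h1"], "d1": ["h1"], "b1": ["h2"]})
```

Note that h1 ends up with a single edge into the Make even though it fed two right-hand side ports, which is the effect the tuple elements are meant to achieve. Output ports are handled symmetrically with an inserted Spread.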

For  example,  the  flow  graph  grammar  of  Figure  3-15,  which  expresses  aggregation  using 
Spreads  and  Makes,  is  converted  to  the  flow  graph  grammar  of  Figure  3-25,  which  expresses 
aggregation  in  the  embedding  relation.  A  sample  reduction  sequence  using  the  rules  of  this 
grammar  is  shown  in  Figure  3-26. 

A flow graph is recognized if it is reduced to a flow graph consisting of a node of a start
type of the grammar, with (possibly empty) trees of nested Makes and Spreads, whose roots
are connected to the start type node’s inputs and outputs, respectively.

The  reduction  transformation  described  here  is  simulated  by  our  parser.  Spreads  and 
Makes  are  not  actually  added  to  the  graph  being  parsed  (just  as  the  graph  being  parsed  is 
not  destructively  reduced).  Section  3.5.2  gives  details  of  how  the  parser  does  this  simulation. 

No  Aggregate  Port  Types  on  Terminals  Except  “Any” 

We  now  slightly  relax  the  restriction  on  our  formalism  that  no  terminal  nodes  have  ports 
of  an  aggregate  type.  We  allow  ports  of  type  Any  on  terminal  nodes  to  take  on  any  port 
type,  including  an  aggregate  port  type.  In  this  formalism,  the  minimally-aggregated  flow 
graphs  in  a  graph  grammar’s  language  might  contain  Spreads  and  Makes  which  are  flat  and 
internal.  We  call  these  residual  Spreads  or  Makes.  Each  residual  Spread  node  must  have  its 
input  aggregate  port  connected  to  a  port  of  type  Any.  Likewise,  the  output  aggregate  port 
on  each  residual  Make  node  must  connect  to  a  port  of  type  Any. 

The main difference this makes to the reduction mechanism is that the simplification
of Spreads and Makes is not as straightforward. When a sub-flow graph isomorphic to the
right-hand side is reduced to a left-hand side with surrounding Makes and Spreads, the
Makes and Spreads may become connected to residual Spreads and Makes.

Figure 3-25: The grammar of Figure 3-15 with aggregation encoded in the embedding
relation.

Figure 3-26: A reduction sequence using the grammar of Figure 3-25.

Figure 3-27: The reduction of a sub-flow graph using the rule for D from Figure 3-25.

A composition of a Make with a Spread node may arise. However, the Make and Spread
will not usually be of corresponding type. The residual Make or Spread may even become
connected to a tree of nested Spreads or Makes, respectively. The usual, straightforward
Make-of-Spread simplification cannot be applied to this composition.

For example, the sub-flow graph containing nodes a, b, and c in Figure 3-27a is reduced
to a non-terminal node of type D, surrounded by Makes and Spreads, using the rule for D
from Figure 3-25. The result of the reduction is shown in Figure 3-27b.

There are two solutions to this. One is built on the other and is more powerful in that
it allows a useful form of partial recognition to be done. The basic solution is to perform
a special-case simplification to the composition. In particular, if all of the outputs of a
residual Spread are connected to inputs of a Make or tree of nested Makes (as they are
in Figure 3-27), then we can simplify this composition by drawing an edge from each port
connected to the residual Spread’s input to each port connected to the output port of the
Make or of the root of the Make tree. We can simplify compositions involving residual
Makes in an analogous way.
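
A minimal Python sketch of this special-case simplification follows. It covers only the flat case of a single Make (not a tree of nested Makes), does not check that the parts line up in order, and uses a hypothetical edge encoding of `((src_node, src_port), (dst_node, dst_port))` pairs. None of the names come from the original system.

```python
def simplify_residual_spread(edges, spread, make):
    """Bypass a residual Spread whose outputs all feed one Make.

    Returns a new edge set. If some Spread output escapes the Make, or the
    Make has an input not fed by the Spread, the edges are returned unchanged.
    """
    edges = set(edges)
    out_of_spread = {e for e in edges if e[0][0] == spread}
    into_make = {e for e in edges if e[1][0] == make}
    if not out_of_spread or out_of_spread != into_make:
        return edges                                  # not applicable
    into_spread = {e for e in edges if e[1][0] == spread}
    out_of_make = {e for e in edges if e[0][0] == make}
    # Draw an edge from each port feeding the Spread's input to each port
    # fed by the Make's output.
    bypass = {(src, dst) for src, _ in into_spread for _, dst in out_of_make}
    return (edges - out_of_spread - into_spread - out_of_make) | bypass


# Hypothetical graph: a port of type Any (on CAR) feeds a residual Spread
# whose two outputs are re-bundled by a Make in front of non-terminal D.
graph = {(("car", "out"), ("sp", "in")),
         (("sp", 1), ("mk", 1)),
         (("sp", 2), ("mk", 2)),
         (("mk", "out"), ("D", "in"))}
simplified = simplify_residual_spread(graph, "sp", "mk")
```

When a Spread output also feeds a node outside the Make, the function leaves the graph alone, which corresponds to the limitation of the basic solution discussed next.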

For example, the flow graph in Figure 3-27b would simplify to the one in Figure 3-27c,
which can be recognized as an S, whose rule is in Figure 3-27d.

The  main  limitation  of  this  basic  solution  is  that  it  does  not  enable  us  to  handle  a  form 
of  partial  recognition  that  we  find  crucial  in  performing  partial  program  recognition.  In 
particular,  we  would  like  to  be  able  to  recognize  aggregate  port  types  that  aggregate  only 
a  subset  of  the  parts  that  are  aggregated  by  a  port  type  used  in  the  input  flow  graph. 

For example, suppose we have the flow graph shown in Figure 3-28a and we want to
recognize an S in it, whose rule is shown in Figure 3-28b. (Perhaps the flow graph in Figure
3-28a represents a program in which some cliched operation is being done to some parts (of
type x and y) of a user-defined data structure F, where these parts compose a cliched data
structure P. At the same time, the user-defined data structure might contain additional
parts (of type m and n) that are keeping track of some statistics, such as how many times
the parts of type x and y are accessed. The operations (p and q) to the statistics-keeping
parts are unfamiliar and need to be ignored when partially recognizing the program.)

The  key  to  partial  recognition  of  flow  graphs  is  the  ability  to  separate  recognizable 
portions  of  a  flow  graph  from  unrecognizable  portions.  For  partial  recognition  of  a  flow 
graph  F  to  succeed,  the  recognizable  section  must  be  a  sub-flow  graph  of  F.  (Recall  the 
discussion  of  Section  3.3.1.)  The  problem  here  is  that  residual  Spreads  and  Makes  keep 
the  unrecognizable  portion  of  the  input  flow  graph  connected  to  the  recognizable  portion, 
preventing  simplification  and  recognition  of  a  sub-flow  graph  of  the  input  flow  graph. 

The reduction of the flow graph using the rule for A yields the flow graph in Figure
3-28c. We cannot simplify the composition of the residual Spread (Spread-F) with the
Make (Make-P) as we do in the first solution because not all of the residual Spread’s outputs
are connected to the Make’s inputs. The same is true for compositions involving residual
Makes.

(Note  that  if  there  are  no  aggregate  port  types  on  terminal  nodes,  there  are  no  residual 
Spreads  or  Makes.  So  this  form  of  partial  recognition  is  handled  easily  in  the  more  restricted 
formalism.) 

To solve this, we make use of the fact that fan-in and fan-out facilitate partial recognition
in that unrecognizable portions of a flow graph that fan out from or fan in to ports internal
to recognizable portions can easily be ignored simply by not being included in the sub-flow
graph matched.

The idea is to break up residual Spreads into two Spreads, one of whose outputs connect
to the recognizable portion while the other’s outputs connect to the unrecognizable portion.
(The input port types of the two Spreads become some brand new type.) The inputs to the
Spreads are connected to edges which fan out from the port(s) of type Any that connected
to the input of the original residual Spread. Residual Makes are broken up into two Makes
analogously. Thus, we isolate the recognizable portion from the unrecognizable portion by
inserting a fan-in or fan-out. For example, the sub-flow graph enclosed in a dashed line in
Figure 3-28d can be recognized as an S once the residual Spreads and Makes are broken
up.

Figure 3-28: (a) A flow graph only partially recognizable as the non-terminal S, whose rule
is in (b). (c) Result of reduction. (d) Breaking up residual Spreads and Makes to facilitate
partial recognition.

How  a  residual  Spread  or  Make  is  to  be  broken  up  is  determined  by  which  connections 
we  are  trying  to  make  with  ports  of  type  Any.  In  other  words,  the  decomposition  is  not 
guessed.  It  is  determined  by  what  we  are  trying  to  connect  together.  It  may  be  broken  up 
in  more  than  one  way,  depending  on  how  many  subsets  of  parts  of  an  aggregate  port  type 
can  be  partially  recognized  as  distinct  aggregate  port  types. 
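
The break-up of a residual Spread can be sketched in Python as follows. This is a conceptual illustration under assumptions: edges are `((src_node, src_port), (dst_node, dst_port))` pairs, the two new Spreads get made-up names derived from the original, and the new port types mentioned above are not modeled.

```python
def split_residual_spread(edges, spread, recognized_parts):
    """Split a residual Spread into a recognized half and a remainder.

    recognized_parts: names of the Spread outputs that the attempted
        connection needs. All other outputs go to the second Spread.
    """
    rec, rest = spread + "/recognized", spread + "/rest"
    result = set()
    for src, dst in edges:
        if dst[0] == spread:
            # Fan out from the port (of type Any) that fed the original Spread.
            result.add((src, (rec, dst[1])))
            result.add((src, (rest, dst[1])))
        elif src[0] == spread:
            owner = rec if src[1] in recognized_parts else rest
            result.add(((owner, src[1]), dst))
        else:
            result.add((src, dst))
    return result


# Hypothetical graph: parts x and y are recognizable, m and n are not.
graph = {(("car", "out"), ("spread-F", "in")),
         (("spread-F", "x"), ("a", "in")),
         (("spread-F", "y"), ("b", "in")),
         (("spread-F", "m"), ("p", "in")),
         (("spread-F", "n"), ("q", "in"))}
split = split_residual_spread(graph, "spread-F", {"x", "y"})
```

The set of recognized parts is an input here, reflecting the point that the decomposition is determined by the connections being attempted rather than guessed.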

As is the case with the rest of the reduction mechanism discussed so far, this is all
simulated in the state of the parser. No graph operations are actually done. See Section
3.5.2 for more details.

3.5  Chart  Parsing  Flow  Graphs 

GRASPR  uses  a  new  graph  parser  which  has  evolved  from  Brotsky’s  flow  graph  parser  [15]. 
It  also  has  been  influenced  by  a  chart-based  flow  graph  parsing  algorithm  developed  by 
Lutz  [90].  See  Figure  3-29.  Brotsky’s  parsing  algorithm  generalized  Earley’s  string  parsing 
algorithm  [32]  to  flow  graphs.  Kay  [71,  72]  and  Thompson  [132,  133]  also  generalized 
Earley’s  parser  to  create  string  chart  parsing.  This  was  a  generalization  of  the  control  of 
Earley’s  algorithm  to  allow  flexibility  in  the  rule-invocation  and  search  strategies  employed. 
Lutz  then  generalized  string  chart  parsing  to  a  type  of  flow  graph  that  is  a  slightly  restricted 
form  of  the  flow  graphs  defined  in  this  report.  (Section  3.6  explains  the  difference.)  The 
flexibility  of  control  in  Lutz’s  flow  graph  chart  parsing  algorithm  has  been  adopted  by  the 
flow  graph  parser  presented  here. 

An earlier version of our parser (described in [144, 145]) was an extension of Brotsky’s
parser that allowed it to handle flow graphs that contain edges that fan-in or fan-out. It
also dealt with some variations due to structure-sharing (in particular, for parsing flow
graphs in which the derivations of two non-terminals overlap). Lutz independently developed
more techniques for dealing with structure-sharing variations. These techniques have
been incorporated into our parser.

Our  formalism  further  extends  that  of  Lutz  and  our  earlier  formalism  to  include  graph 
grammars  that  encode  aggregation  information.  Our  parser  also  extends  the  class  of  flow 
graph  variations  that  are  tolerated  to  include  variations  due  to  aggregation  organization. 

The  main  characteristics  of  the  parser  are: 

•  It  deterministically  simulates  a  non-deterministic  parser. 

•  It  finds  all  possible  parses  and  keeps  track  of  all  partial  analyses. 

•  It can handle ambiguous grammars.

•  It reuses previously found parses so that it can avoid re-doing work (i.e., it shares
subderivations).

•  It has a flexible control structure. Its rule invocation strategy (top-down vs. bottom-up)
and its search strategy can be specified as part of its inputs.

•  The order in which parses are constructed does not matter. (This is useful in being
able to incrementally construct parses and to advise the parser to focus on certain
parts of its search while postponing others.)

•  It is able to make use of analyses it has obtained while parsing to create alternative
views of the input graph. This can in turn allow more analyses to be constructed.

•  During reduction, it can aggregate not only a set of right-hand side nodes into a single
left-hand side non-terminal, but also an aggregation of inputs (or outputs) of a right-hand
side flow graph into a single input (or output) of a left-hand side non-terminal.

Figure 3-29: Flow graph parser evolution.


The  Basics  of  Chart  Parsing 

Chart  parsers  maintain  a  database,  called  a  chart,  of  partial  and  complete  analyses  of  the 
input.  This  is  shown  in  Figure  3-30.  The  elements  in  the  chart  are  called  items.  (In 
string  chart  parsing,  they  are  called  “edges.”  Lutz  [90]  calls  them  “patches.”)  An  item 
might  be  either  complete  or  partial.  Complete  items  represent  the  recognition  of  some 
terminal  or  non-terminal  in  the  grammar.  Partial  items  represent  a  partial  recognition  of  a 
non-terminal. 

A complete item for a terminal node is created for each node in the input graph during
initialization. A complete item for a non-terminal node is created when there are complete
items for each of the constituents of the right-hand side of some rule for the node's type, and
the locations of the constituents satisfy the right-hand side’s edge connection constraints.
Each complete item keeps track of the location in the input graph at which the instance
of the node type has been found. It also contains pointers to the subitems on which it
depends, as well as other information.

Figure 3-30: Graph chart parsing.

Partial items, on the other hand, contain information about how much of a rule’s right-
hand side has been recognized so far. Each contains a dotted rule, which specifies the
non-terminal being recognized, the rule used to recognize it, which constituents have been
found, and which constituents are still needed.

Fundamental  Event 

The  most  basic  operation  of  a  chart  parser  is  to  create  new  items  by  combining  a  partial 
item with a complete one. This is called the fundamental event. If there is a partial item
that needs a non-terminal A at a particular location and there is a complete item for
A at that location, then the partial item can be extended with the complete
item.  During  extension,  a  copy  of  the  partial  item  is  created  and  augmented.  This  results  in 
a  new  item  which  is  added  to  the  chart.  (When  a  partial  item  is  extended  with  a  complete 
one,  they  are  said  to  be  “combined.”)  Duplicate  items  are  never  added  to  the  chart.  This 
avoids  redoing  work.  (Also,  items  are  never  removed  from  the  chart.) 
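
The fundamental event can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the parser's actual implementation: the `Item` class, `combine`, and `add_to_chart` are hypothetical names, items carry no location information, and the extendibility check looks at node types only (the port-based adjacency check is omitted).

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Item:
    """A chart item; it is complete when no constituents are still needed."""
    lhs: str                         # type being recognized
    needed: Tuple[str, ...]          # constituent types still missing
    found: Tuple["Item", ...] = ()   # sub-items matched so far

    @property
    def complete(self) -> bool:
        return not self.needed


def combine(partial: Item, complete: Item) -> Optional[Item]:
    """Fundamental event: return an extended COPY of `partial`, or None."""
    if partial.complete or not complete.complete:
        return None
    if complete.lhs not in partial.needed:
        return None                  # simplified extendibility check (types only)
    i = partial.needed.index(complete.lhs)
    remaining = partial.needed[:i] + partial.needed[i + 1:]
    return Item(partial.lhs, remaining, partial.found + (complete,))


def add_to_chart(chart: Set[Item], item: Item) -> bool:
    """Add `item` unless an equal item is already in the chart."""
    if item in chart:
        return False
    chart.add(item)
    return True


chart: Set[Item] = set()
a = Item("A", ())                    # complete item for A
s0 = Item("S", ("A", "B"))           # partial item: S still needs A and B
add_to_chart(chart, a)
add_to_chart(chart, s0)
s1 = combine(s0, a)                  # new item: S now needs only B
added_first = add_to_chart(chart, s1)
added_again = add_to_chart(chart, combine(s0, a))  # duplicate is rejected
```

Because `combine` builds a new item instead of mutating `s0`, the original partial item stays in the chart and can later be extended with a different complete item for A.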

In  the  string  chart  parsing  literature,  the  chart  is  described  as  a  graph.  The  nodes 
represent  locations  in  the  string  being  parsed  and  the  edges  represent  the  partial  or  complete 
recognition  of  some  terminal  or  non-terminal  between  two  locations.  In  string  chart  parsing, 
the  retrieval  of  pairs  of  edges  to  participate  in  the  fundamental  event  is  based  primarily  on 
location.  Whenever  a  partial  and  complete  edge  meet  (i.e.,  satisfy  the  adjacency  criterion), 
the pair becomes a candidate. The set of pairs is then further refined by an extendibility
criterion (which typically checks terminal or non-terminal types).

In  string  chart  parsers,  it  makes  sense  to  use  the  adjacency  criterion  as  the  first  filter  in 
retrieving  pairs  of  edges  to  be  combined.  It  only  requires  looking  up  the  edges  that  start  at 
a  particular  node  in  the  chart  (graph).  Then  the  extendibility  criterion  can  be  applied  to 
these  edges. 

However, in graph parsing, the “edges” (items) are between sets of ports. The adjacency
criterion now requires that the inputs and outputs of the completed item be a subset of the
outputs  and  inputs  (respectively)  of  the  partial  one.  Since  there  can  be  many  possible  pairs 
of  items  that  satisfy  this  criterion,  we  use  part  of  the  extendibility  criterion  to  help  retrieve 
pairs  of  items  to  combine.  Additional  constraints  have  been  added  to  the  extendibility 
criterion  as  a  way  of  narrowing  down  the  search  for  analyses.  For  example,  some  of  the 
non-structural  constraints  on  attributes  have  been  incorporated  into  the  criterion.  The 
choice  of  which  constraints  to  include  depends  on  the  cost  of  checking  the  constraints  at 
this  point  in  the  parsing.  (See  Section  6.2.2.) 
Figure 3-31: (a) Adding a complete item to the chart. (b) Adding a partial item to the
chart.


Agenda-Based 

In  chart  parsers,  an  agenda  is  used  to  queue  up  the  items  to  be  added  to  the  chart.  Items  are 
continually  pulled  off  the  agenda  and  placed  in  the  chart.  As  an  item  is  added,  it  is  paired 
with  other  items  with  which  it  can  be  combined.  If  the  item  being  added  is  a  complete 
item,  then  it  is  paired  with  partial  items  that  need  it.  On  the  other  hand,  if  the  item  added 
is  a  partial  item,  then  it  is  paired  with  any  complete  items  for  the  non-terminals  it  needs. 
These  two  cases  are  illustrated  in  Figure  3-31. 

The  agenda  makes  it  easy  to  control  which  things  are  added  to  the  chart  and  when  they 
are  added.  This  explicit  control  can  be  used  to  enforce  a  particular  rule  invocation  strategy 
or  search  strategy. 

For  example,  we  can  make  the  parser  adopt  a  bottom-up  parsing  strategy,  as  shown  in 
Figure  3-32.  Whenever  a  complete  item  is  added  to  the  chart,  new  empty  items  can  be 
added  to  the  agenda  for  each  rule  that  needs  the  complete  item  to  get  started  (i.e.,  the  rule 
has  a  minimal  right-hand  side  node  that  is  of  the  same  type  as  the  type  derived  by  the 
complete  item).  The  new  item  is  instantiated  at  a  location  that  depends  on  the  location  of 
the  complete  item. 

Likewise,  we  can  achieve  a  top-down  parsing  algorithm.  First,  during  initialization, 
empty  items  must  be  added  for  each  rule  that  derives  a  start  type  of  the  grammar.  (An 
“empty”  item  is  a  partial  item  that  needs  complete  items  for  all  of  its  rule’s  right-hand 
side  constituents.)  For  each  such  rule,  an  empty  item  must  be  instantiated  at  each  of  the 
possible  matchings  of  the  inputs  of  the  input  graph  to  the  inputs  of  the  rule’s  left-hand  side. 
Second, whenever a partial item is added to the chart, a new empty item must be added to
the agenda for each rule that derives a non-terminal needed by the partial item. The new
item must be instantiated at a location that depends on where the partial item needs the
non-terminal constituent.

Figure 3-32: A bottom-up rule invocation strategy affects adding a complete item to the chart.

(In  the  current  program  recognition  system,  we  use  only  a  bottom-up  strategy,  since  this 
facilitates  partial  recognition.  This  also  makes  it  easier  to  recognize  non-terminals  for  which 
there  are  rules  with  mismatching  arity  between  the  left-hand  and  right-hand  sides.  This  is 
necessary in handling rules whose right-hand sides have inputs (representing constants) that
do  not  correspond  to  left-hand  side  input  ports.  Allowing  a  right-hand  side  to  have  more 
inputs  and  outputs  than  the  left-hand  side  is  also  crucial  in  allowing  the  type  of  embedding 
relation  that  encodes  aggregation  relationships.  A  top-down  strategy  would  require  that  we 
predict  the  organization  of  aggregation  when  each  empty  item  is  first  instantiated  (before 
the  item’s  rule’s  right-hand  side  is  matched).  In  other  words,  it  requires  searching  for  the 
appropriate  sequence  of  aggregation-introduction  transformations  needed  to  recognize  the 
flow  graph,  as  discussed  in  Section  3.4.2.) 

The  way  in  which  the  agenda  is  maintained  determines  not  only  the  rule  invocation 
strategy,  but  also  the  parser’s  search  strategy.  While  we  can  control  whether  the  parsing 
algorithm  proceeds  top-down  or  bottom-up  by  controlling  what  gets  added  to  the  agenda, 
we  can  choose  a  particular  search  strategy  (e.g.,  depth-first  or  breadth-first),  simply  by 
controlling  the  order  in  which  items  are  pulled  off  of  the  agenda.  The  agenda  might  be 
maintained  as  a  first  in,  first  out  (FIFO)  queue  to  achieve  breadth-first  search,  for  example. 
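
The effect of the agenda discipline on search order can be illustrated with a small Python sketch. It is a toy model under stated assumptions: `run_agenda` and the `successors` table are hypothetical, items are plain strings, and `successors(item)` stands in for whatever new items the fundamental event and rule invocation would queue when `item` enters the chart.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, List, Set


def run_agenda(initial: Iterable[Hashable],
               successors: Callable[[Hashable], Iterable[Hashable]],
               strategy: str = "breadth-first") -> List[Hashable]:
    """Move items from the agenda to the chart; return the order of addition."""
    agenda = deque(initial)
    chart: Set[Hashable] = set()
    order: List[Hashable] = []
    while agenda:
        # FIFO removal gives breadth-first search, LIFO gives depth-first.
        item = agenda.popleft() if strategy == "breadth-first" else agenda.pop()
        if item in chart:            # duplicates are never added to the chart
            continue
        chart.add(item)
        order.append(item)
        agenda.extend(successors(item))
    return order


# Hypothetical table: which new items each item gives rise to.
table = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
bfs = run_agenda(["a"], lambda item: table[item], "breadth-first")
dfs = run_agenda(["a"], lambda item: table[item], "depth-first")
```

Both runs place the same items in the chart; only the order in which they are added differs.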

The strategy for maintaining the agenda can be given by the user. It is one of the ways in which
advice from an expectation-driven component or a human user can be incorporated into
the code-driven component. See Figure 3-33.

Figure 3-33: Search strategy as input to parser.

The parser is guaranteed to find every parse exactly once, no matter which rule invocation
or search strategy is used.

Additional  Monitors 

One  final  aspect  of  the  architecture  of  the  parser  is  that  it  contains  additional  monitors  that 
watch the chart. See Figure 3-34. These detect the existence of certain kinds of items or
collections  of  items  in  the  chart  which  can  be  used  to  generate  other  items.  In  particular, 
they  look  for  opportunities  to  view  part  of  the  input  graph  in  an  alternative  way  in  order  to 
yield  more  parses.  The  graph  is  not  explicitly  changed  to  the  alternative  view.  Instead,  new 
items  are  created  which  represent  the  alternative  views  and  these  are  added  to  the  agenda. 

An  example  of  this  is  employed  in  simulating  the  zipping  up  of  an  input  graph  as 
explained  in  Section  3.5.1,  which  describes  how  share-equivalent  flow  graphs  are  recognized. 

Selectively  Trying  Harder 

We do not necessarily want the parser to generate all of the alternative views of the input
graph. So, the opportunities for generating new items representing these views are
queued  on  an  agenda.  These  opportunities  can  be  selectively  pulled  from  the  agenda  and 
performed.  The  parser  can  be  given  advice  from  an  external  agent  about  how  and  when 
to make the selection. The parser can be made to incrementally try harder. It can report
easy recognitions early, and then be given more time later to generate alternative views
that uncover the obscured cliches. So, quick results can be obtained, without sacrificing
completeness in the long run.

Figure 3-34: Additional monitors.

The  parser  can  also  be  directed  to  generate  alternative  views  only  within  a  certain  area 
of  the  input  graph.  For  example,  if  no  cliches  were  found  in  a  particular  area  of  the  input 
graph,  the  parser  could  try  generating  alternative  views  in  that  area  in  case  this  would  allow 
more  cliches  to  surface. 

Asking  for  Advice 

The  monitors  might  also  detect  question-triggering  patterns  in  the  chart.  These  are  patterns 
that  indicate  that  a  particular  constraint  is  likely  to  hold.  This  is  useful  if  the  constraint  is 
costly  for  the  parser  to  check.  When  such  a  pattern  is  found,  the  recognition  system  can  ask 
whether  the  constraint  is  satisfied.  The  question  might  be  more  easily  answered  by  some 
other  source  (such  as  an  expectation-driven  component  in  a  hybrid  recognition  system). 

Now  that  the  basic  operation  of  the  chart  parser  for  flow  graphs  has  been  described, 
the  next  three  sections  give  details  of  how  the  extensions  to  the  formalism  and  st-thrus  are 
handled. 

Motivations  for  Copying  Before  Extension 

Each  time  a  partial  item  is  extendable  by  a  complete  one,  a  copy  of  the  partial  item  is 
created  and  the  copy  is  extended.  There  are  three  reasons  that  the  parser  extends  a  copy 
of the partial item, rather than the original. One is that the parser is leaving itself open to
the possibility of ambiguity. It might be possible in the future for the partial item to be
extended  with  another  complete  item  for  the  same  right-hand  side  node.  By  not  changing 
the  original  partial  item,  the  parser  continually  has  a  partial  item  that  can  accept  alternative 
derivations  for  its  immediately  needed  nodes. 

The  alternative  complete  item  need  not  be  a  duplicate  of  the  first.  If  both  satisfy  the 
constraints  of  the  partial  item,  with  respect  to  its  matching  so  far,  then  both  can  extend 
the  partial  item.  For  example,  the  two  complete  items  might  have  overlapping  locations, 
but  if  the  partial  item  only  constrains  the  location  that  is  shared  by  the  two  items,  both 
can  extend  the  partial  item.  So  the  parser  is  using  copying  to  deal  with  partial  ambiguity. 

The  second  reason  is  that  copying  facilitates  partial  recognition.  When  a  complete  item 
is  recognizing  a  partial  item’s  immediately  needed  node  that  is  on  the  left  fringe,  then 
extending  a  copy  of  the  partial  item  allows  the  partial  item  to  be  extended  with  a  different 
complete  item,  representing  an  instance  of  the  left-fringe  node  at  a  different  location  in  the 
input  graph.  (This  is  a  special  case  of  ambiguity.) 

A  third  reason  to  copy  before  extending  is  that  this  facilitates  incremental  analysis 
[149].  There  are  two  forms  of  incremental  analysis.  One  is  incrementally  analyzing  a  static 
input  graph.  This  is  achieved  in  chart  parsing  by  iteratively  adding  complete  items  for  each 
of  the  input  graph’s  nodes  to  the  chart.  A  depth-first  retrieval  of  items  from  the  agenda 
can  ensure  that  all  partial  analyses  of  the  input  graph  considered  so  far  are  created  before 
another  node  of  the  input  graph  is  considered  (i.e.,  the  complete  item  for  the  node  is  added 
to  the  chart). 

The  other  type  of  incremental  analysis  is  useful  to  do  when  the  input  graph  is  changing. 
(This  might  happen  when  the  recognition  system  is  being  used  to  aid  maintenance,  for 
example.)  It  involves  updating  the  results  of  a  previously  parsed  input  graph  to  account  for 
a  modification  to  the  input  graph.  This  type  of  incremental  analysis  requires  1)  creating 
analyses  of  the  new  sub-flow  graph  and  incorporating  them  into  the  existing  analyses,  and 
2)  retracting  analyses  that  depend  on  the  old  sub-flow  graph  that  has  changed.  Augmenting 
existing  analyses  based  on  the  new  information  is  another  case  of  the  first  type  of  incremental 
analysis.  Retracting  analyses  that  are  no  longer  valid  involves  first  finding  the  items  to 
retract  and  then  doing  the  retraction. 

Copying before extension makes doing the retraction of an item easy. All partial items
whose  copies  were  extended  with  the  item  are  still  around,  unmodified.  They  represent 
intermediate  states  in  the  search  for  an  analysis,  before  the  complete  item  advanced  the 
search.  Retraction  of  an  item  can  be  done  by  “killing”  the  item  in  the  chart  and  each 
partial  item  it  extended,  as  well  as  their  item  tree  descendants.  The  original  partial  item 
will  remain. 
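
A minimal Python sketch of this retraction step follows. It assumes a hypothetical `derived_from` table that records, for each item, the items that were created by combining with it; the text does not say how the parser stores these links, so the representation is an assumption.

```python
from typing import Dict, Hashable, List, Set


def retract(item: Hashable,
            derived_from: Dict[Hashable, List[Hashable]],
            chart: Set[Hashable]) -> Set[Hashable]:
    """Kill `item` and everything built on it; return the set of killed items."""
    dead: Set[Hashable] = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current in dead:
            continue
        dead.add(current)
        stack.extend(derived_from.get(current, ()))
    chart.difference_update(dead)    # unmodified original partial items survive
    return dead


# P0 is a partial item; extending a copy of it with complete item A gave P1,
# and extending a copy of P1 with complete item B gave the complete item S.
chart = {"P0", "A", "B", "P1", "S"}
derived_from = {"P0": ["P1"], "A": ["P1"], "P1": ["S"], "B": ["S"]}
killed = retract("A", derived_from, chart)
```

After retracting A, the original partial item P0 and the unrelated complete item B remain in the chart, so the search can resume from the state it was in before A advanced it.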

Finding  the  items  to  retract  requires  keeping  track  of  dependencies  between  the  input 
graph’s  structure  (and  attributes)  and  the  items  that  represent  recognitions  of  it.  Most  of 
this  dependency  information  is  contained  in  the  item’s  structure  in  the  form  of  links  to  sub- 
items  that  represent  its  components.  The  leaves  of  these  links  are  the  items  for  terminal 
nodes in the input graph. However, more dependency information must be maintained
than  is  in  the  current  implementation.  If  any  edges  are  added  or  attributes  are  changed, 
constraints  might  no  longer  be  satisfied.  The  information  of  how  items  depend  on  the  nodes, 
edges,  and  attributes  of  the  input  graph  is  important  not  only  in  deciding  which  items  to 
retract,  but  also  which  previously  failing  items  or  item  combination  attempts  might  now 
be  valid.  So  this  dependency  information  is  also  relevant  in  the  incremental  addition  of 
analyses  and  the  augmentation  of  existing  analyses. 

3.5.1  Recognizing  Share-Equivalent  Flow  Graphs 

Recall  from  Section  3.4.1  that  a  recognizer  or  parser  for  a  structure-sharing  flow  graph 
grammar  may  work  by  interleaving  zipping  and  unzipping  transformation  steps  with  the 
usual reduction steps. Our chart parser simulates this introduction in two ways. First,
unzipping  the  input  graph  is  simulated  by  allowing  sub-derivations,  in  the  form  of  sub-items, 
to  be  shared.  For  example,  suppose  we  give  the  parser  the  input  flow  graph  shown  in  Figure 
3-35a  with  the  grammar  of  Figure  3-35b.  Once  the  parser  creates  a  complete  item  for  D, 
it  is  shared  between  the  items  for  A  and  B.  Parsing  yields  the  derivation  graph  shown  in 
Figure  3-35c. 

Second,  zipping  up  the  input  graph  is  simulated  using  a  “zip-up”  monitor.  For  example, 
an  input  flow  graph  might  redundantly  contain  two  instances  of  the  same  non-terminal  A, 
where  the  inputs  and/or  the  outputs  of  the  two  instances  fan  out  from  or  into  the  same 
port(s).  (See  Figure  3-36b.)  The  right-hand  side  flow  graph  that  we  are  looking  for  might 
maximally  share  a  single  instance  of  the  non-terminal  (as  does  the  rule  for  S  in  Figure 
3-36a).  We  would  like  to  view  the  input  program  as  maximally  sharing  the  two  instances 
of  A,  so  that  the  right-hand  side  flow  graph  will  match.  This  is  done  by  generating  an 
item  for  A  that  “zips  up”  the  two  items  for  A  that  were  created.  (See  Figure  3-36c.)  The 
location and sub-items of the new zipped up item are the unions of the locations and sub-items
(respectively) of its zip-up components.
Also, the attribute values of the zipped up item’s left-hand side are computed based on
those of the zip-up components. The attribute combination function associated with each
attribute held by the zip-up components’ left-hand sides is used to compute a new value
of the attribute. In particular, for each attribute a_i associated with the left-hand side's
non-terminal type, a_i’s combination function is applied to the attribute values held for a_i
by the left-hand sides of the zip-up components. (The attribute combination functions may
be  partial  functions.  If  the  function  is  not  defined  for  the  attributes  of  some  left-hand  sides 
whose  items  are  being  zipped  up,  then  the  zip-up  attempt  fails.) 
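
The zip-up step can be sketched as follows. This is an illustration only: `CompleteItem`, `zip_up`, and `same_or_undefined` are hypothetical names, locations are simplified to sets of integers, and a partial combination function signals "undefined" by returning `None`, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, Optional

Combiner = Callable[[Any, Any], Optional[Any]]


@dataclass
class CompleteItem:
    node_type: str
    location: FrozenSet[int]       # simplified: ids of matched input elements
    sub_items: FrozenSet[str]      # simplified: names of sub-items
    attrs: Dict[str, Any]          # attribute values of the left-hand side


def zip_up(a: CompleteItem, b: CompleteItem,
           combiners: Dict[str, Combiner]) -> Optional[CompleteItem]:
    """Merge two items for the same non-terminal, or fail with None."""
    if a.node_type != b.node_type:
        return None
    attrs: Dict[str, Any] = {}
    for name, combine in combiners.items():
        value = combine(a.attrs[name], b.attrs[name])
        if value is None:          # combination function undefined here
            return None            # so the zip-up attempt fails
        attrs[name] = value
    return CompleteItem(a.node_type,
                        a.location | b.location,     # union of locations
                        a.sub_items | b.sub_items,   # union of sub-items
                        attrs)


def same_or_undefined(x: Any, y: Any) -> Optional[Any]:
    """A partial combination function: defined only when both values agree."""
    return x if x == y else None


a1 = CompleteItem("A", frozenset({1, 2}), frozenset({"i1"}), {"kind": "sum"})
a2 = CompleteItem("A", frozenset({1, 3}), frozenset({"i2"}), {"kind": "sum"})
a3 = CompleteItem("A", frozenset({1, 4}), frozenset({"i3"}), {"kind": "product"})
zipped = zip_up(a1, a2, {"kind": same_or_undefined})
failed = zip_up(a1, a3, {"kind": same_or_undefined})
```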

3.5.2  Recognizing  Aggregation-Equivalent  Flow  Graphs 

Following  the  discussion  of  Section  3.4.2,  this  section  describes  the  recognition  of  aggregation- 
equivalent flow graphs first for the restricted formalism in which no terminal has an aggregate
port type and then for the less restrictive formalism. Recall that the recognition
process  for  the  restricted  formalism  included  “inserting”  Spread  and  Make  nodes  whenever 
an  isomorphic  occurrence  of  a  right-hand  side  is  reduced  to  a  left-hand  side  non-terminal 
node  with  aggregate  ports.  The  Spread  and  Make  nodes  serve  to  bundle  up  the  edges 
surrounding  the  non-terminal  node.  The  recognition  process  also  “simplified”  any  Make- 
of-Spread  composition  that  results  from  the  insertion  of  Spreads  and  Makes.  These  actions 
are  simulated  by  the  flow  graph  chart  parser. 

In  particular,  items  keep  track  of  where  the  right-hand  side  is  found,  using  a  set  of 
location  pointers,  which  indicate  which  edges  correspond  to  the  inputs  and  outputs  of  the 
right-hand  side  of  the  item’s  rule.  To  represent  the  addition  of  a  Make  or  Spread,  the 
location  pointers  are  placed  in  tuples,  which  are  nested  in  tree  structures.  The  nested 
tuples  reflect  the  organization  of  the  aggregation  of  the  edges  to  which  they  refer.  An 
element  of  the  tuple  can  be  either  another  tuple  or  a  set  of  location  pointers.  (A  set  of  more 
than  one  location  pointer  represents  fan-in  or  fan-out.)  When  items  are  combined,  their 
location  pointers  are  compared  to  see  if  they  represent  a  Make-of-Spread  that  simplifies 
correctly.  The  corresponding  parts  of  the  tuples  are  compared.  If  both  parts  are  tuples, 
they  are  compared  recursively.  If  both  are  sets,  the  sets  must  have  a  non-empty  intersection 
for  the  comparison  to  succeed.  If  one  is  a  set  and  the  other  a  tuple,  the  comparison  fails. 
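
The recursive comparison just described can be written directly. In this sketch, nested Python tuples stand for the tuple structure and `frozenset`s of integers stand for sets of location pointers. The function name `compatible` is hypothetical, and requiring two tuples to have the same length is an added assumption that the text does not state.

```python
from typing import FrozenSet, Tuple, Union

Loc = Union[FrozenSet[int], Tuple["Loc", ...]]


def compatible(a: Loc, b: Loc) -> bool:
    """Compare corresponding parts of two nested location-pointer tuples."""
    a_is_tuple = isinstance(a, tuple)
    b_is_tuple = isinstance(b, tuple)
    if a_is_tuple and b_is_tuple:
        # Both tuples: compare element by element, recursively.
        return len(a) == len(b) and all(
            compatible(x, y) for x, y in zip(a, b))
    if not a_is_tuple and not b_is_tuple:
        # Both sets: they must share at least one location pointer.
        return bool(a & b)
    # One set and one tuple: the comparison fails.
    return False


fs = frozenset
ok = compatible((fs({1}), (fs({2}), fs({3, 4}))),
                (fs({1, 9}), (fs({2}), fs({4}))))
disjoint = compatible((fs({1}),), (fs({2}),))
mixed = compatible(fs({1}), (fs({1}),))
```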

For example, Figure 3-37a shows the flow graph in the language of the grammar in
Figure  3-25,  whose  reduction  is  shown  in  Figure  3-26.  Location  pointers  are  shown  as 
integers  annotating  the  edges  and  edge  stubs.  Figure  3-37b  shows  the  items  created  by  the 
parser  in  parsing  this  graph.  The  nested  tuple  on  the  input  in  the  item  for  D,  for  instance, 
represents  the  nested  Make  nodes  “inserted”  during  the  reduction  sequence  of  Figure  3-26. 
The creation of the complete item for S shows the comparison between the nested tuples
on  the  output  of  D  and  the  input  of  E. 

Note that the simulation method used by the parser relies on using a bottom-up rule
invocation strategy. It compares the tuples of location pointers that are organized based
on the recognition of a rule’s right-hand side, rather than predicting the organization and
then verifying it by trying to match the right-hand side at the predicted location.

Figure 3-36: (a) A graph grammar that maximally shares the non-terminal A. (b) An input
flow graph containing two redundant instances of A. (c) An alternative view created by
“zipping up” the input graph.

Figure 3-37: (a) A flow graph with location pointers. (b) Items created during parsing.

We  now  consider  recognizing  flow  graphs  in  the  less  restrictive  formalism  in  which  there 
still are no aggregate port types on terminal nodes, but the type Any is a union type of
aggregate  and  non-aggregate  types.  Recognition  involves  a  special-case  simplification  of 
compositions  of  residual  Makes  (or  Spreads)  with  the  nested  Spreads  (or  Makes)  that  are 
“inserted”  during  reduction.  Recall  that  to  perform  partial  recognition,  in  which  parts  of 
an  aggregate  port  type  used  in  the  input  graph  are  ignored,  we  need  to  “break  up”  the 
residual  Spreads  (or  Makes)  so  that  recognizable  portions  of  the  flow  graph  are  separated 
from  unrecognizable  portions. 

This  is  simulated  in  the  state  of  the  parser,  using  operations  on  the  location  pointers 
of  items.  Residual  Spreads  and  Makes  are  removed  from  the  input  flow  graph.  They  are 
replaced  with  fan-out  and  fan-in,  respectively. 

(As  is  discussed  in  Section  4.2.3,  some  of  the  information  found  in  residual  Spreads  and 
Makes  is  useful  for  generating  documentation  about  which  data  structure  cliches  were  found 
in  a  program  and  how  their  parts  relate  to  user-defined  structures’  parts.  This  information 
is  placed  in  attributes  on  the  fan-out  or  fan-in  edges  that  replace  a  Spread  or  Make.) 

In  the  combination  operation,  a  nested  tuple  of  location  pointers  “inserted”  during 
reduction  of  a  rule’s  right-hand  side  may  be  compared  with  a  flat,  unordered  set  of  location 
pointers,  representing  the  fan-out  or  fan-in  edges  that  replaced  a  residual  Make  or  Spread. 
The  combination  is  valid  if  for  each  list  Lp  of  location  pointers  in  the  fringe  of  the  tree 
formed  by  the  nested  tuple,  at  least  one  location  pointer  in  Lp  is  a  member  of  the  flat  set 
of  location  pointers.  Not  all  of  the  pointers  in  the  flat  set  of  location  pointers  need  to  be 
members  of  some  list  of  location  pointers  within  the  nested  tuple. 

For  example,  the  input  flow  graph  generated  from  the  example  of  Figure  3-28  is  shown 
in Figure 3-38. In creating a complete item representing the recognition of S, the flat set of
location pointers representing the residual Spread, {2, 3, 4, 5}, is compared with the tuple
of location pointers, <2, 3>, representing the aggregation of types x and y into A’s input
port type P. (See Figure 3-38b.) Likewise, the tuple <6, 7> is compared with the flat set
of pointers {6, 7, 8, 9}. Both comparisons succeed.
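
The validity test for comparing a nested tuple with a flat set can be sketched in Python and checked against the two comparisons in this example. The names `fringe` and `matches_flat` are hypothetical. Each list Lp in the fringe is modeled as a `frozenset`, and each pointer inside the tuples <2, 3> and <6, 7> is assumed to form a singleton list.

```python
from typing import FrozenSet, Iterator, Tuple, Union

Loc = Union[FrozenSet[int], Tuple["Loc", ...]]


def fringe(nested: Loc) -> Iterator[FrozenSet[int]]:
    """Yield the pointer lists at the leaves of a nested tuple, left to right."""
    if isinstance(nested, tuple):
        for element in nested:
            yield from fringe(element)
    else:
        yield nested


def matches_flat(nested: Loc, flat: FrozenSet[int]) -> bool:
    """Valid if every fringe list has at least one pointer in the flat set."""
    return all(bool(leaf & flat) for leaf in fringe(nested))


fs = frozenset
spread_ok = matches_flat((fs({2}), fs({3})), fs({2, 3, 4, 5}))
make_ok = matches_flat((fs({6}), fs({7})), fs({6, 7, 8, 9}))
missing = matches_flat((fs({2}), fs({10})), fs({2, 3, 4, 5}))
```

Pointers 4 and 5 (and 8 and 9) in the flat sets are simply left unmatched, which is what permits partial recognition of an aggregate.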

3.5.3  Matching  St-Thrus 

When  two  left-hand  side  ports  of  a  rule  correspond  with  each  other  in  the  embedding 
relation,  the  rule  contains  a  st-thru.  Because  st-thrus  are  part  of  the  embedding  relation 
rather  than  the  right-hand  side  flow  graph,  they  are  not  matched  in  the  same  way  as  nodes 
and  edges  of  the  right-hand  side.  They  can  possibly  match  any  edge  in  the  input  flow  graph. 

St-thrus impose a global constraint. Suppose a rule for a non-terminal A contains a
st-thru involving ports labeled 1 and 3 on A, as in Figure 3-39. If an item completes for A
and is combined with a partial item, the complete item places a constraint on the locations
of non-terminals that are connected to A at ports 1 and 3 in the partial item’s rule. The
constraint requires that these adjacent non-terminals be located at endpoints of the same
edge. The st-thru essentially imposes a constraint that the non-terminals connected to A
at ports 1 and 3 be connected to each other. (See Figure 3-40.)

Figure 3-38: Simulating the break up of residual Spreads and Makes.

St-thrus  differ  based  on  whether  or  not  they  are  structurally  constrained  and  whether 
or  not  they  are  optional.  A  st-thru  is  structurally  constrained  if  the  embedding  relation 
restricts  it  to  matching  edges  that  fan  out  (or  in)  with  edges  coming  into  (or  out  of)  an 
isomorphic  occurrence  of  a  right-hand  side.  In  other  words,  a  st-thru  is  constrained  if  one 
or  both  of  the  two  corresponding  left-hand  side  ports  also  correspond  to  some  right-hand 
side  port. 

Structurally unconstrained st-thrus are not restricted in this way. They exist when two
left-hand  side  ports  correspond  to  each  other  and  no  other  right-hand  side  port.  These 
types  of  st-thrus  often  arise  when  a  right-hand  side  with  Spreads  and  Makes  is  translated 
to  a  non-aggregated  right-hand  side.  If  the  output  of  a  boundary  Spread  connects  directly 
to  an  input  of  a  boundary  Make  and  neither  port  connects  any  other  ports,  a  structurally 
unconstrained  st-thru  arises. 

We refer to structurally constrained st-thrus as simply “constrained” st-thrus (and structurally
unconstrained ones as “unconstrained”), with the understanding that this is referring
only to structural constraints. Most st-thrus, including unconstrained ones, have non-
structural constraints (in the form of attribute conditions) imposed upon them by their
rule.

Figure 3-39: Grammar containing a rule with a st-thru.

Figure 3-40: Constraint on combination imposed by st-thrus.

Constrained  and  unconstrained  st-thrus  are  both  matched  to  a  set  of  edges,  which  is 
then  narrowed  down,  based  on  the  context  in  which  its  rule’s  right-hand  side  is  reduced  to 
its  left-hand  side.  An  unconstrained  st-thru  initially  matches  the  set  of  all  edges,  while  the 
constrained  st-thru  matches  the  subset  of  edges  that  satisfy  the  restrictions  imposed  by  the 
embedding  relation.  These  sets  of  matching  edges  are  shrunk  as  non-structural  constraints 
are  checked  and  the  reduction  of  higher-level  non-terminals  in  the  parse  tree  occurs. 

For example, suppose a Circular Indexed Sequence Insert and a Circular Indexed Sequence
Extract non-terminal were recognized in the input graph, as shown in Figure 3-41.
When the locations of the Insert and Extract non-terminals are compared during combination,
the location pointer tuples are compared element-by-element. The First part of the
output of CIS Insert represents an unconstrained st-thru and is initially matched to all edges
(shown  pictorially  by  a  wild-card  *).  During  combination,  this  First  part  is  matched  with 
the  First  part  of  the  input  to  the  CIS  Extract  instance.  This  narrows  down  its  matching 
set  of  edges  to  those  indicated  by  location  pointers  10  and  13.  The  Size  part  of  the  CIS 
Insert  output  also  comes  straight  through  CIS  Insert’s  right-hand  side,  but  because  it  fans 
out  with  the  input  to  MOD,  it  is  constrained  to  be  matched  to  a  small  number  of  edges 
(those  indicated  by  location  pointers  5  and  6). 

Global  constraints  represented  by  the  st-thru  are  imposed  by  propagating  reductions 
in  sets  of  matching  edges  across  non-terminals  and  across  edges.  For  example,  once  the 
item  for  CIS  Extract  extends  the  partial  item  of  Figure  3-41,  the  wild-card  matches  can  be 
reduced  to  a  small  set  of  matches.  Figure  3-42  shows  the  result  of  propagation  of  st-thru 
match  reduction.  Now  CIS  Extract’s  output  constrains  the  location  of  its  Last  part  (to 
location  9),  restricting  the  location  at  which  the  second  CIS  Insert  should  be  found. 

Constrained  and  unconstrained  st-thrus  can  additionally  be  described  as  either  optional 
or  required.  Required  st-thrus  must  be  assigned  a  match,  while  optional  st-thrus  need  not. 

Optional  st-thrus  are  useful  in  the  program  recognition  domain,  where  it  is  often  the 
case  that  there  is  no  edge  matching  a  st-thru.  This  occurs  if  no  operation  makes  use  of  the 
data  represented  by  the  st-thru.  For  example,  the  edge  indicated  by  the  location  pointer  18 
in Figure 3-41 might not exist if no operation following the CIS Extract uses the Base part
of  the  output  CIS.  St-thrus  representing  data  structure  parts  are  optional.  An  example  of 
a  required  st-thru  is  that  of  the  rule  representing  the  Negate-if-Negative  implementation  of 
the  Absolute  Value  cliche.  (See  Figure  3-9.) 

The  only  difference  this  designation  makes  is  in  what  it  means  if  the  reduction  of  sets  of 
matching  edges  results  in  an  empty  set  of  possible  matches.  If  the  st-thru  is  required,  this 
empty  set  means  the  recognition  of  the  rule’s  left-hand  side  failed.  Otherwise,  the  set  of 
possible  matches  of  an  optional  st-thru  can  become  empty  without  causing  the  recognition 
to  fail. 
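
The distinction between required and optional st-thrus can be illustrated with a minimal Python sketch. It is not GRASPR's implementation: the `StThru` record, its `narrow` method, and the use of `None` to stand for the wild-card are assumptions made for illustration, and the edge numbers are simply the location pointers mentioned in the text.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class StThru:
    """Hypothetical record for one st-thru of a recognized rule instance."""
    name: str
    required: bool
    # None stands for the wild-card '*': not yet constrained, matches all edges.
    matches: Optional[Set[int]] = None

    def narrow(self, candidates: Set[int]) -> bool:
        """Shrink the set of matching edges; return False if recognition fails."""
        if self.matches is None:
            self.matches = set(candidates)
        else:
            self.matches &= candidates
        # An empty match set is fatal only for a required st-thru.
        return bool(self.matches) or not self.required


# An unconstrained, optional st-thru narrowed from the wild-card to two edges.
first = StThru("First", required=False)
ok_first = first.narrow({10, 13})

# An optional st-thru may lose all of its matches without causing failure.
base = StThru("Base", required=False, matches={18})
ok_base = base.narrow(set())

# A required st-thru whose match set becomes empty makes the recognition fail.
needed = StThru("required-st-thru", required=True, matches={4})
ok_needed = needed.narrow({7})
```

Under these assumptions, `ok_first` and `ok_base` are true while `ok_needed` is false, which mirrors the only difference the designation makes.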

Figure 3-41: Constrained and unconstrained st-thrus. (The figure shows the partial item.)

Figure 3-42: Propagating matches of st-thrus. (The figure shows the resulting partial item.
Notice that the location pointers have been propagated to replace the wild-cards.)


3.6  Related  Graph  Grammar  Work 


Graph  grammars  have  been  used  widely  in  automatic  circuit  understanding  and  verification, 
pattern  analysis,  compiler  technology,  and  in  software  development  environments.  (See 
[34,  35,  134]  for  several  examples  in  these  areas.) 

There  are  many  varieties  of  graph  grammar  formalisms.  They  vary  both  in  the  classes 
of  graphs  that  are  generated  and  by  the  embedding  mechanisms  used.  In  this  section,  we 
briefly  discuss  the  classes  of  graphs  commonly  studied  and  relate  our  flow  graphs  to  them. 
Then  we  discuss  typical  embedding  mechanisms.  Finally,  we  describe  interesting  graph 
parsers  related  to  ours. 

3.6.1  Classes  of  Graphs 

Early  graph  grammar  work  focused  on  traditional  graphs,  in  which  nodes  do  not  have 
distinct  entry  and  exit  points  (“ports”).  This  includes  work  on  webs  and  web  grammars 
[27,  94,  102,  105,  119].  These  traditional  types  of  graphs  are  also  generated  by  node-label 
controlled  (NLC)  graph  grammars  [120]  and  by  the  algebraic  rewriting  approaches  [23,  33]. 
(NLC grammars are controlled by node labels (i.e., our node types) in that labels are
important in choosing a node to rewrite and in that the embedding relation is defined in
terms of labels, rather than specific nodes in a rule’s right-hand side or in the host graph.
Edge-label controlled graph grammars [52, 92] are closely related in that they can simulate
NLC grammars.) NLC grammars and algebraic rewriting are discussed further in Section
3.6.2. Their relation to each other is studied by Kreowski and Rozenberg in [80].

Traditional  graphs  are  a  special  case  of  graph  classes  in  which  nodes  have  ports.  These 
more  general  graph  classes  include  Lutz’s  flowgraphs  [90]  and  hypergraphs  [53],  as  well  as 
our  flow  graphs. 

Lutz’s  [90]  “flowgraphs”  are  a  special  type  of  our  flow  graph.  They  contain,  in  addition 
to  nodes,  ports,  and  edges,  tie-points  which  are  intermediate  points  through  which  ports 
are  connected  to  each  other.  Since  each  port  is  connected  to  exactly  one  tie-point,  fan-in 
and  fan-out  are  not  captured  to  the  same  level  of  granularity  as  is  captured  by  flow  graphs. 
For example, they cannot express the following situation: an output port $p_1$ fans out to
input ports $p_3$ and $p_4$, while output port $p_2$ is only connected to $p_4$.
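
The limitation can be checked mechanically. The following Python sketch is an illustration under a stated assumption, not part of Lutz's formalism as published: it assumes that all output ports attached to a tie-point are connected to all input ports attached to the same tie-point. The function name and the port names are hypothetical.

```python
from itertools import product


def expressible_with_tie_points(edges):
    """edges: a set of (output_port, input_port) pairs with unique port names.

    Ports joined by an edge must share a tie-point. Since each port has
    exactly one tie-point, every output in a group then reaches every
    input in that group.
    """
    parent = {}

    def find(port):
        parent.setdefault(port, port)
        while parent[port] != port:
            parent[port] = parent[parent[port]]  # path halving
            port = parent[port]
        return port

    # Group ports that would have to share a tie-point.
    for out, inp in edges:
        parent[find(out)] = find(inp)

    groups = {}
    for out, inp in edges:
        outs, ins = groups.setdefault(find(out), (set(), set()))
        outs.add(out)
        ins.add(inp)

    # Expressible only if no group implies an edge that is absent.
    return all((o, i) in edges
               for outs, ins in groups.values()
               for o, i in product(outs, ins))


# The situation from the text: p1 -> p3, p1 -> p4, p2 -> p4 (but not p2 -> p3).
situation = {("p1", "p3"), ("p1", "p4"), ("p2", "p4")}
plain_fan_out = {("p1", "p3"), ("p1", "p4")}
```

Here `expressible_with_tie_points(situation)` is false, because a shared tie-point would also connect p2 to p3, whereas the plain fan-out case is expressible.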

Hypergraphs  can  be  seen  as  flowgraphs  (in  Lutz’s  sense),  where  nodes  in  a  hypergraph 
correspond  to  tie-points  and  hyperedges  correspond  to  flowgraph  nodes.  Engelfriet  and 
Rozenberg  [36]  and  Vogler  [136]  study  the  relationships  between  hypergraph  grammars  and 
boundary  NLC  graph  grammars.  (In  boundary  NLC  grammars,  no  two  non-terminal  nodes 
are  neighbors  in  any  right-hand  side  [121].) 




3.6.2  Embedding  Mechanism 

Our basic flow graph formalism makes use of a simple embedding relation to specify the
connectivity  of  the  right-hand  side  with  the  host  graph  when  a  left-hand  side  is  expanded 
during  derivation.  This  type  of  embedding  mechanism  is  quite  common.  However,  in  some 
formalisms,  embedding  is  more  complicated. 

In  NLC  rewriting,  the  connectivity  of  the  right-hand  side  nodes  with  the  nodes  in  the 
“embedding  area”  (i.e.,  those  nodes  adjacent  to  the  left-hand  side  node  being  expanded)  is 
determined  by  a  connection  relation  on  node  labels  (types).  In  particular,  a  right-hand  side 
node  is  connected  to  a  node  in  the  embedding  area  if  their  node  labels  are  related  by  the 
connection relation. (For example, if label $l_1$ is related to label $l_2$, all right-hand side nodes
having label $l_1$ become connected to all nodes of label $l_2$ in the embedding area.)
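
As a minimal sketch of this label-based embedding, the following Python function computes the connections created when a node is rewritten. The function name, the dictionary encoding of labeled nodes, and the example labels are assumptions made for illustration only.

```python
def nlc_embed(rhs_labels, area_labels, connection):
    """Return the edges created between right-hand side and embedding-area nodes.

    rhs_labels:  dict mapping each right-hand side node to its label
    area_labels: dict mapping each embedding-area node to its label
    connection:  set of (rhs_label, area_label) pairs (the connection relation)
    """
    return {(r, a)
            for r, r_label in rhs_labels.items()
            for a, a_label in area_labels.items()
            if (r_label, a_label) in connection}


# Hypothetical example: label l1 is related to label l2.
rhs = {"r1": "l1", "r2": "l1", "r3": "l3"}
area = {"a1": "l2", "a2": "l2", "a3": "l4"}
new_edges = nlc_embed(rhs, area, {("l1", "l2")})
```

Every right-hand side node labeled l1 becomes connected to every embedding-area node labeled l2; nodes r3 and a3 receive no connections because their labels are not related.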

In  set-theoretic  approaches  [96],  the  embedding  can  involve  nodes  that  are  not  in  the 
immediate  neighborhood  of  the  left-hand  side  being  replaced.  The  nodes  to  which  the 
right-hand  side  nodes  are  connected  are  specified  by  path  expressions,  such  as  “all  nodes 
that  can  be  reached  from  the  left-hand  side  node  by  following  an  outgoing  edge  of  label  k 
and  then  an  incoming  edge  of  label  t.”  These  complicated  embedding  transformations  are 
used  mainly  in  graph  generation  (e.g.,  for  specification  purposes  in  software  development 
environments  [98,  97]). 

Part  of  each  production  in  the  algebraic  approach  [38]  is  a  set  of  gluing  points,  which 
can  be  edges  as  well  as  nodes.  Both  the  left-  and  right-hand  sides  of  the  productions  can 
be  graphs  containing  more  than  one  node.  The  gluing  points  are  two  sets  of  nodes  and/or 
edges,  one  for  each  side  of  the  production.  These  sets  are  in  bijective  correspondence  with 
each  other.  They  remain  when  the  left-hand  side  is  removed  and  form  an  anchor  for  the 
right-hand side that replaces it. In other words, the embedding relation is captured in the
sets  of  corresponding  gluing  points. 

3.6.3  Graph  Parsers 

Work  on  applications  of  graph  grammars  has  focused  mostly  on  graph  generation,  rather 
than  analysis.  However,  recently  there  has  been  more  interest  in  developing  graph  parsers. 

Bamji  [8,  9]  developed  a  special  case  of  a  chart  parser  for  graphs  equivalent  to  Lutz’s  flow 
graphs.  The  interesting  aspect  of  Bamji’s  graph  grammar  formalism  is  that  his  grammar 
rules  have  an  embedding  relation  in  which  each  left-hand  side  port  can  be  related  to  a  set 
of  right-hand  side  ports.  Unlike  tuples  in  our  embedding,  these  sets  are  not  ordered  and 
the  right-hand  side  ports  aggregated  in  them  are  homogeneous  in  that  they  have  the  same 
type  and  are  not  distinguished  by  position  in  the  set.  The  chart  parser  imposes  simple 
set-intersection  conditions  between  the  port  sets  of  adjacent  non-terminals  in  right-hand 
sides  of  rules. 

Bamji  developed  this  formalism  for  the  purposes  of  representing  and  verifying  circuit 
designs.  His  parser’s  efficiency  is  gained  by  using  only  deterministic  grammars  and  using 
a straightforward rewriting: whenever a right-hand side matches a subgraph, replace it
(destructively) with the left-hand side. Bamji’s parser does not try to obtain all possible
parses; just one is sufficient for verification.

Franck  [44]  and  Kaul  [69,  70]  study  precedence  graph  grammars.  They  both  present  a 
precedence  graph  parser  which  is  a  straightforward  extension  of  string  precedence  parsing 
using the well-known Wirth-Weber precedence relations. Graphs can be parsed in linear
time with these parsers. However, precedence graph grammars are restricted to be unambiguous
and uniquely invertible. Precedence techniques may be useful on subsets of
our graph grammar that have these properties.

Bunke and Haller [18] and Peng, et al. [103] have each developed parsers for plex
grammars. These parsers are generalizations of Earley’s algorithm, similar to Brotsky’s.

Wittenburg,  et  al.  [150]  give  a  unification-based,  bottom-up  chart  parser  which  is  similar 
to  Lutz’s  and  our  chart  parser.  Grammar  rules  place  a  strict  (total)  ordering  on  the  nodes 
in  their  right-hand  sides.  This  ordering  determines  the  order  in  which  items  are  extended. 
This  creates  fewer  partial  analyses,  which  is  advantageous  in  terms  of  efficiency,  but  is  a 
drawback  in  terms  of  generating  partial  results  when  the  graph  contains  unrecognizable 
sections. 

Chapter 4


Applying  Parsing  to  Recognition 


Chapter  2  described  the  cliches  that  we  have  collected  in  our  library  and  Chapter  3  described 
the  basics  of  the  parsing  technique  that  we  apply  to  recognize  them  in  a  wide  range  of 
programs.  This  chapter  fills  in  the  details  of  encoding  programs  and  cliches  in  the  flow 
graph  formalism  and  of  applying  the  flow  graph  parser  to  the  partial  program  recognition 
problem.  Sections  3.3  and  3.4.2  gave  glimpses  of  how  programs  and  cliches  are  encoded  in 
the  flow  graph  formalism.  In  Section  4.1,  we  review  and  fill  in  more  details  of  this  encoding. 
Then  in  Section  4.2,  we  complete  the  picture  by  providing  details  of  GRASPR’s  architecture. 


4.1  Expressing  Programs  and  Cliches  in  the  Flow  Graph 
Formalism 

We use the flow graph formalism to represent programs and programming cliches. In particular,
flow graphs serve as graphical abstractions of programs, flow graph grammars encode
allowable implementation steps between abstract operations and lower-level operations, and
the derivation trees resulting from parsing give the program’s top-down design.

The  flow  graph  is  used  to  represent  the  operations  of  a  program  and  the  dataflow  between 
them.  Each  non-sink  node  in  a  flow  graph  represents  a  function,  with  ports  on  the  node 
representing  distinct  inputs  and  outputs  of  the  function.  The  ports’  types  are  determined 
by  the  signature  of  the  function.  Sink  nodes  represent  conditional  tests.  The  edges  of  a  flow 
graph  represent  dataflow  constraints  between  the  functions  and  tests.  When  the  result  of 
a  function  is  consumed  by  more  than  one  function,  the  edges  representing  the  dataflow  fan 
out. Edges that fan in represent the conditional merging of more than one dataflow. For
example, Figure 3-8 shows the attributed flow graph representation of the program RIGHTP,
given in Figure 3-7.
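
To make the notions of fan-out and fan-in concrete, here is a small Python sketch of a flow graph fragment stored as a list of edges between ports. The port encoding, the node names, and both helper functions are hypothetical and chosen only for illustration; they do not reproduce the RIGHTP example.

```python
from collections import Counter

# A port is (node, direction, index); an edge runs from an output port
# to an input port. The fragment below is invented for illustration.
EDGES = [
    (("sqrt", "out", 1), ("plus", "in", 1)),
    (("sqrt", "out", 1), ("times", "in", 1)),      # same source: fan-out
    (("negate", "out", 1), ("print", "in", 1)),
    (("identity", "out", 1), ("print", "in", 1)),  # same destination: fan-in
]


def fan_out_ports(edges):
    """Output ports whose result is consumed by more than one function."""
    counts = Counter(src for src, _ in edges)
    return {port for port, n in counts.items() if n > 1}


def fan_in_ports(edges):
    """Input ports at which more than one dataflow is conditionally merged."""
    counts = Counter(dst for _, dst in edges)
    return {port for port, n in counts.items() if n > 1}
```

In this fragment the output of `sqrt` fans out to two consumers, and two alternative dataflows fan in at the input of `print`.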

Information  about  a  program’s  control  flow,  recursion,  and  data  aggregation  is  captured 
in  the  attributes  of  the  flow  graph  representation  of  the  program.  Section  4.1.1  describes 
the  key  attributes  and  conditions  used  in  representing  programs  and  programming  cliches. 

Attributed  flow  graphs  and  grammar  rules  can  become  difficult  for  people  to  read.  For 
presentation purposes, we make use of a macro-notation, called the Plan Calculus (developed
by Rich, Shrobe, and Waters [110, 114, 117, 127, 137]), which graphically summarizes some
classes of attributes and conditions, making them more readable. Section 4.1.2 introduces
this notation. The Plan Calculus is used here only as a visual aid; the primary representation
used  by  GRASPR  is  the  flow  graph. 

The  Plan  Calculus  aided  us  in  building  the  cliche  library.  It  formed  a  representational 
stepping  stone  between  English  descriptions  of  cliches  and  their  encoding  as  attributed 
flow  graph  grammar  rules.  It  facilitates  the  capture  of  relationships  between  cliches,  such 
as  implementation  relationships  and  temporal  abstractions.  Section  4.1.3  discusses  this 
further. 

Section  4.1.4  demonstrates  how  the  event-driven  simulation  cliche  and  the  cliches  it  is 
built  upon  are  expressed  in  the  flow  graph  formalism.  It  goes  from  the  English  description 
of  the  cliches  to  their  Plan  Calculus  rendering  and  then  to  the  flow  graph  grammar  rules 
that GRASPR actually uses to recognize PiSim.

4.1.1  Attribute  Language 

Attributes  on  flow  graphs  store  control  flow,  recursion,  and  data  aggregation  information 
about  a  program.  In  particular,  each  node  has  a  control  environment  attribute  which 
specifies  when  the  operation  represented  by  the  node  is  executed,  relative  to  when  other 
operations  in  the  program  are  executed.  Nodes  in  the  same  control  environment  represent 
operations  that  are  performed  under  the  same  conditions  (so  they  are  each  performed  the 
same  number  of  times).  These  nodes  are  said  to  co-occur. 

Nodes  that  represent  conditional  tests  have  two  additional  attributes,  success-ce  and 
failure-ce.  Operations  in  the  success-ce  (resp.  failure-ce)  control  environment  are  executed 
when the conditional test succeeds (resp. fails).

Control environments form a partial order. A control environment $ce_i$ is less than or
equal to another control environment $ce_j$ (denoted $ce_i \sqsubseteq ce_j$) iff nodes in $ce_j$ are performed
at least as many times as those in $ce_i$. For example, the success-ce of a node representing a
conditional test is less than or equal to the control environment of the same node, because
operations on a conditional branch are performed no more often than the conditional test.

A  flow  graph  representing  a  recursive  function  F  contains  a  node  whose  type  is  F. 
This  is  called  the  recursive  node.  We  assume  our  recursive  functions  always  have  at  least 
one  exit  test  and  are  singly  recursive.  (Section  7.2.1  discusses  extensions  for  modeling 
multiple  recursion  in  the  future.)  Figure  4-2  shows  the  flow  graph  representing  the  program 
HT-Insert  given  in  Figure  4-1.  (This  is  a  simple  hash  table  program  in  which  Structure 
is  an  array  of  buckets.  Each  bucket  is  a  list  of  strings,  ordered  lexicographically.)  The 
recursive  node  is  the  one  labeled  “Splice-In-Bucket.” 

    (defun HT-Insert (Element Structure)
      (let* ((Key (Hash Element Structure))
             (Bucket (aref Structure Key)))
        (copy-replace-elt (Splice-In-Bucket Element Bucket)
                          Key
                          Structure)))

    (defun Splice-In-Bucket (Element Bucket)
      (if (null Bucket)
          (cons Element Bucket)
          (let ((Entry (car Bucket)))
            (if (string> Entry Element)
                (cons Element Bucket)
                (let ((Rest (cdr Bucket)))
                  (if (string= Entry Element)
                      (cons Element Rest)
                      (cons Entry (Splice-In-Bucket Element Rest))))))))

Figure 4-1: A recursive function with multiple exits.

We distinguish three control environments in flow graphs representing recursive functions:

• recur-ce - the top-most control environment of the flow graph representing the recursive
function. It is the control environment of the node representing the first operation
performed by the recursive function. In Figure 4-2, this is ce2.

• feedback-ce - the control environment of the node representing the recursive call within
the body of the recursive function. In Figure 4-2, this is ce8.

• outside-ce - the control environment in which the recursive function is called and
into which it exits. In Figure 4-2, it is ce1. (If the recursive function is analyzed
independent of any callers, a new control environment is created to be the outside-ce.)

The feedback-ce and the outside-ce are always $\sqsubseteq$ the recur-ce. Operations performed
before  the  exit  test  (i.e.,  in  the  recur-ce)  are  always  performed  more  times  than  the  recursive 
call  or  the  operations  done  upon  exit,  since  they  are  performed  when  the  recursion  exits 
as  well  as  when  it  repeats.  If  there  is  only  one  exit,  then  the  node  representing  the  exit 
test  has  the  recur-ce  as  its  control  environment,  the  feedback-ce  as  its  failure-ce,  and  the 
outside-ce  as  its  success-ce.  (If  a  new  control  environment  had  been  created  to  represent 
the  outside-ce,  then  it  becomes  equal  to  the  success-ce  of  the  test.) 

Summing  Incomparable  Control  Environments 

Some subsets of control environments are said to be incomparable. In particular, if $ce_a$ and
$ce_b$ are the success-ce and failure-ce of the same node, then the set $\{ce_a, ce_b\}$ is incomparable.
In addition, the control environments in which a recursion is exited form an incomparable set.
(There will be more than one such control environment if the recursion has multiple exits.)
These are the control environments of the nodes that are executed in the base cases
of the recursion. For example, in Figure 4-2, the set {ce3, ce5, ce7} is incomparable.

Figure 4-2: Flow graph representing HT-Insert.

We define a partial function $+_{ce}$ as follows. If a set $S$ of control environments
is not incomparable, then $+_{ce}(S)$ is undefined. Otherwise, if $S$ is a success-ce/failure-ce
pair for the same node, then $+_{ce}(S)$ is the control environment of that node. If $S$ is a
set of control environments in which a recursion is exited, then $+_{ce}(S)$ is the outside-ce of
that recursion. In Figure 4-2, $+_{ce}\{ce3, ce5, ce7\} = ce1$, while $+_{ce}\{ce3, ce5\}$ is undefined.
(Intuitively, the result of $+_{ce}$ can be viewed as the control environment in which operations
are performed as many times as the combined number of times operations in the control
environments of the incomparable set are performed.)

Another function, $\sum_{ce}$, on sets of control environments is defined recursively in terms of
$+_{ce}$ as:

• If $|S| = 2$, then $\sum_{ce} S = +_{ce}(S)$.

• If there is a set $S' \subseteq S$ which is incomparable, then $\sum_{ce} S = \sum_{ce}(+_{ce}(S') \cup (S - S'))$.

• Otherwise, $\sum_{ce} S$ is undefined.

In other words, if a single control environment can be obtained by recursively reducing
(using $+_{ce}$) all incomparable subsets of the input set $S$, then that control environment is the
result. Otherwise, $\sum_{ce} S$ is undefined. For example, in Figure 4-2,
$\sum_{ce}\{ce3, ce5, ce7, ce8\} = \sum_{ce}\{ce3, ce5, ce6\} = \sum_{ce}\{ce3, ce4\} = ce2$.
Also, $\sum_{ce}\{ce3, ce5, ce8\}$ is undefined, while $\sum_{ce}\{ce3, ce5, ce7\} = ce1$.
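
The two functions can be sketched in Python as follows. This is an illustration only, not GRASPR's code. The tables of incomparable sets are read off the examples given in the text for Figure 4-2. Two details are assumptions of the sketch: a singleton set sums to its only element (needed so that the reduction of {ce3, ce5, ce7} to ce1 terminates with a result), and the search tries every incomparable subset until one reduction succeeds.

```python
from itertools import combinations

# success-ce/failure-ce pairs and the control environment of their test node,
# as implied by the worked examples for Figure 4-2.
BRANCH_PAIRS = {
    frozenset({"ce3", "ce4"}): "ce2",
    frozenset({"ce5", "ce6"}): "ce4",
    frozenset({"ce7", "ce8"}): "ce6",
}
# The set of exit control environments of the recursion and its outside-ce.
EXIT_SETS = {frozenset({"ce3", "ce5", "ce7"}): "ce1"}


def plus_ce(ces):
    """+ce: defined only on incomparable sets; None means undefined."""
    key = frozenset(ces)
    return BRANCH_PAIRS.get(key) or EXIT_SETS.get(key)


def sum_ce(ces):
    """Sum over ce: reduce incomparable subsets until one ce remains."""
    remaining = frozenset(ces)
    if len(remaining) == 1:  # assumption: a singleton sums to itself
        return next(iter(remaining))
    for size in range(2, len(remaining) + 1):
        for subset in combinations(sorted(remaining), size):
            reduced = plus_ce(subset)
            if reduced is None:
                continue
            result = sum_ce((remaining - frozenset(subset)) | {reduced})
            if result is not None:
                return result
    return None  # undefined
```

Trying several subsets matters: in {ce3, ce5, ce7, ce8}, reducing {ce3, ce5, ce7} first leads to a dead end, whereas reducing {ce7, ce8} first leads to ce2.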

This summing function is used as the attribute combination function for control environment
attributes. Recall from Section 3.5.1 that when two items are zipped up, the
attribute values of the resulting item’s left-hand side are computed based on those of the
zip-up components. Each attribute has an attribute combination function associated with
it. This is used to compute a new value of an attribute, based on the values of that attribute
held by the zip-up components’ left-hand sides. For all control environment attributes, the
attribute combination function is $\sum_{ce}$. This is a partial function. If the sum is not defined
for the set of control environments being combined, the zip-up of the items involved fails.

Partial  Order  Graph  of  Control  Environments 

We represent the partial ordering of control environments in an annotated partial order
graph which facilitates the operations of checking $\sqsubseteq$ and computing $+_{ce}$ and $\sum_{ce}$. The
annotated partial order graph has nodes representing control environments. An edge is
drawn from one node representing $ce_i$ to another representing $ce_j$ iff $ce_i \sqsubseteq ce_j$. This edge is
annotated with the set of control environments that together with the source $ce_i$ form an
incomparable set.

Recursion information: [recur-ce: ce2, feedback-ce: ce8, outside-ce: ce1]

Figure 4-3: Annotated partial order graph representing the relationships between the control
environments of HT-Insert.

Associated with this graph is a set of triples, one for each recursive function call represented
by the flow graph. (There may be more than one if the flow graph represents a
program  that  calls  more  than  one  recursive  function,  including  nested  recursions.)  Each 
triple  contains  the  recur-ce,  feedback-ce,  and  outside-ce  of  the  flow  graph  representing  the 
recursive  function. 

For example, Figure 4-3 shows the annotated partial order graph for the control environments
of the flow graph in Figure 4-2. One triple of recursion information is associated
with the graph.

Edge  Attributes 

In addition to the control environment attributes on nodes, control flow information is
contained in attributes on edges. Each edge holds a ce-from attribute, which indicates the
control environment in which the edge carries dataflow. For example, in Figure 4-2, the
ce-from attribute on the edge from the top-most cons (in the figure) to the copy-replace-elt
indicates that the operation copy-replace-elt receives dataflow only in the control environment
ce3, which is the success-ce of the first null-test node. (Edges that fan in represent
conditional merging of dataflow.)

Each edge also carries a constant-type attribute whose value is either a constant (such as
T, NIL, 0) or undefined, depending on whether the edge represents dataflow from a constant.

Flow graphs for programs containing user-defined aggregate data structures hold attributes
that represent the aggregation information. Each edge holds an accessor attribute
that describes how the data it carries results from the destructuring of some data structure.
Each edge also holds a constructor attribute that describes how the data it carries
becomes part of some data structure. (The value of these attributes is undefined if the
edge is not carrying data involved in some aggregation.) The attributed flow graph can
be seen as the flow graph that results from 1) making a flow graph that includes Spreads
and Makes to represent aggregation, 2) transforming it into a minimally aggregated
flow graph using aggregation-removal transformations, and 3) replacing any residual
Spreads and Makes with fan-out and fan-in edges, respectively.

As these nodes are removed, the naming information they contain is placed into attributes.
This information is useful in presenting the results of recognition and can be a
source of guidance for the recognition system, as discussed in Sections 4.2.3, 6.4.1, and 7.2.3.
Because these attributes are primarily used by the Paraphraser, we defer describing them
until Section 4.2.3.

Input  and  Output  Correspondences 

In addition to control environment attributes, flow graphs for recursive functions have attributes
which represent the relationship between the inputs (resp. outputs) of the flow
graph and the inputs (resp. outputs) of the node representing the recursive call. In particular,
an output port $p_o$ input-corresponds to an input port $p_i$ iff $p_o$ is connected to the
$j$th input of the recursive node and $p_i$ represents an input to an operation that receives
dataflow from the $j$th input of the recursive function.¹ Similarly, an input port $p_i$
output-corresponds to an output port $p_o$ iff $p_i$ is connected to the $k$th output of the recursive node
and $p_o$ represents an output that sends dataflow to the $k$th output of the recursive function.
The input-corresponds and output-corresponds relations are not symmetric, transitive, or
reflexive.

For  example,  in  the  flow  graph  representing  HT-Insert,  shown  in  Figure  4-2,  the  output 
port on the cdr node input-corresponds with each of the input ports of null-test, car, cdr,
and the second input of each of the cons nodes in control environments ce3 and ce5. (Input
and  output  correspondences  are  illustrated  by  subscripted  asterisks  and  stars,  respectively.) 
The  second  input  of  the  cons  in  the  feedback-ce  output-corresponds  with  the  output  port 
of  each  of  the  cons  nodes. 

Because  recursions  can  be  nested  within  each  other,  it  is  necessary  to  be  more  specific 
about  the  conditions  under  which  a  pair  of  ports  input-  or  output-correspond  (i.e.,  in 
which  recursion  does  the  correspondence  occur).  This  is  done  by  associating  with  each 
correspondence  relation  the  feedback-ce  of  the  recursion  in  which  the  ports  correspond.  All 
correspondences  in  this  flow  graph  have  the  feedback-ce  ce8  associated  with  them. 
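
The input-corresponds relation for this example can be computed with a short Python sketch. The edge encoding below is a hypothetical transcription of the dataflow of Splice-In-Bucket from Figure 4-1, with the cons nodes named after the control environments mentioned in the text; it is not the representation GRASPR uses, and it omits the feedback-ce annotation for brevity.

```python
# Sources: ("INPUT", j) is the j-th input of the recursive function and
# (node, "out") is a node's output port. Destinations: (node, j) is the
# j-th input port of a node. All names here are illustrative.
RECURSIVE_NODE = "Splice-In-Bucket"
EDGES = {
    # Element (input 1)
    (("INPUT", 1), ("cons-ce3", 1)), (("INPUT", 1), ("cons-ce5", 1)),
    (("INPUT", 1), ("cons-ce7", 1)), (("INPUT", 1), ("string>", 2)),
    (("INPUT", 1), ("string=", 2)), (("INPUT", 1), (RECURSIVE_NODE, 1)),
    # Bucket (input 2)
    (("INPUT", 2), ("null-test", 1)), (("INPUT", 2), ("car", 1)),
    (("INPUT", 2), ("cdr", 1)), (("INPUT", 2), ("cons-ce3", 2)),
    (("INPUT", 2), ("cons-ce5", 2)),
    # Entry and Rest
    (("car", "out"), ("string>", 1)), (("car", "out"), ("string=", 1)),
    (("car", "out"), ("cons-ce8", 1)),
    (("cdr", "out"), ("cons-ce7", 2)), (("cdr", "out"), (RECURSIVE_NODE, 2)),
    # Result of the recursive call
    ((RECURSIVE_NODE, "out"), ("cons-ce8", 2)),
}


def input_corresponds(edges, recursive_node):
    """Pairs (output port, input port) related through the same input index j."""
    feeds_recursive = {}  # j -> output ports connected to the j-th input of the call
    reads_input = {}      # j -> input ports fed by the j-th input of the function
    for src, (node, j) in edges:
        if node == recursive_node and src[0] != "INPUT":
            feeds_recursive.setdefault(j, set()).add(src)
    for src, dst in edges:
        if src[0] == "INPUT":
            reads_input.setdefault(src[1], set()).add(dst)
    return {(po, pi)
            for j, outs in feeds_recursive.items()
            for po in outs
            for pi in reads_input.get(j, ())}


pairs = input_corresponds(EDGES, RECURSIVE_NODE)
```

With this encoding, the only output port that takes part in the relation is the output of cdr, and it input-corresponds with the five input ports named in the text.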

¹ The input-corresponds relation was previously called feeds-back [146] in flow graphs representing
tail-recursive functions, but it was renamed in the current representation, which is generalized to
represent regular recursion as well as tail recursion.

            (failure-ce (n> null-test)))
    3. (ce= (ce-from (st-thru> 1 2))
            (success-ce (n> null-test)))

Attribute-Transfer Rules:

    1. ce := (ce (n> null-test))

Figure 4-4: Flow graph grammar rule for Negate-if-Negative, with actual attribute conditions.

Attribute Conditions and Transfer Rules

Graph grammar rules impose constraints on the attributes of the flow graphs to which their
right-hand  sides  match.  The  attribute  conditions  and  attribute-transfer  rules  are  expressed 
in  terms  of: 

• Functions that map a port, node, or edge in a rule’s right-hand side or a rule’s
st-thru to the port, node, or edge in the input graph to which it is matched when the
right-hand side (and st-thru) are recognized. These are p>, n>, e>, and st-thru>.

• Attribute accessor functions which when given a node or edge return the value of that
attribute of the node or edge. For example, ce-from computes the ce-from attribute
value of an edge. These accessor functions are both primitive accessor retrieval functions
and functions built on top of them, such as control environment computations
involving $+_{ce}$.

• Relations on the attribute values, such as $\sqsubseteq$, and predicates on nodes and edges that
are defined in terms of these primitive relations and the attribute accessor functions.
For  example,  co-occur  is  a  predicate  that  takes  two  nodes  and  checks  whether  their 
control  environments  are  equal. 
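
The layering of accessors, relations, and predicates listed above can be sketched in Python as follows. The attribute tables, node and edge names, and function names are hypothetical; the sketch only illustrates how a predicate such as co-occur can be built from a primitive accessor and an equality relation.

```python
# Hypothetical attribute tables for a few matched input-graph nodes and edges.
NODE_ATTRS = {
    "test": {"ce": "ce2", "success-ce": "ce3", "failure-ce": "ce4"},
    "op-a": {"ce": "ce3"},
    "op-b": {"ce": "ce3"},
}
EDGE_ATTRS = {("op-a", "op-c"): {"ce-from": "ce3"}}


# Primitive attribute accessor functions.
def ce(node):
    return NODE_ATTRS[node]["ce"]


def success_ce(node):
    return NODE_ATTRS[node]["success-ce"]


def ce_from(edge):
    return EDGE_ATTRS[edge]["ce-from"]


# A relation on attribute values.
def ce_eq(ce_a, ce_b):
    return ce_a == ce_b


# A predicate defined in terms of the accessor and the relation.
def co_occur(node_a, node_b):
    return ce_eq(ce(node_a), ce(node_b))
```

For instance, `co_occur("op-a", "op-b")` holds because both nodes are in ce3, and a condition of the form `(ce= (ce-from ...) (success-ce ...))` corresponds to `ce_eq(ce_from(("op-a", "op-c")), success_ce("test"))`.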

For example, Figure 4-4 gives the rule for Negate-if-Negative, a common implementation
of the Absolute Value cliche. (This rule is repeated from Figure 3-9, where the attribute
conditions were given informally.) In the first condition, (p> < 2) refers to the input graph
port matching the port labeled 2 on <. Source? tests whether this port receives dataflow
from a constant equal to 0.

Attribute Conditions:

    1. (input-corresponds? (p> 1+ 2) (p> 1+ 1)
                           (feedback-ce (innermost-recur (n> 1+))))
    2. (ce= (ce-from (st-thru> 1 2))
            (recur-ce (innermost-recur (n> 1+))))

Attribute-Transfer Rules:

    1. ce := (ce (n> 1+))

Figure 4-5: Grammar rule for counting-up cliche.

In  the  second  condition,  e>  is  used  to  refer  to  an  edge  in  the  input  graph  whose  source 
matches  an  output  of  the  rule’s  right-hand  side.  It  constrains  this  edge  to  have  a  ce-from 
attribute  that  is  equal  to  the  failure-ce  of  the  node  that  matches  null-test. 

The third condition uses st-thru> to refer to an edge that matches the st-thru. It
constrains  this  edge  to  have  a  ce-from  attribute  that  is  equal  to  the  success-ce  of  the  node 
that  matches  null-test. 

The  attribute-transfer  rule  computes  the  control  environment  of  the  left-hand  side  node 
to  be  the  control  environment  of  the  node  matching  null-test. 

Attribute  accessor  functions  are  provided  to  compute  the  recursion  information  for  the 
innermost  recursion  containing  a  particular  node.  These  are  used  in  many  constraints 
for  iterative  cliches.  A  typical  constraint  is  that  two  ports  input-correspond  or  output- 
correspond  in  the  feedback-ce  of  the  innermost  recursion  containing  some  node. 

For example, Figure 4-5 shows the grammar rule representing the iteration cliche
counting-up, which repeatedly increments the value of its input. The input starts with some
initial value and is subsequently the result of the increment performed on the previous
iteration. The rule constrains the input graph ports matching the output and input ports
of 1+ to input-correspond in the feedback-ce of the innermost recursion in which the input
graph node matching 1+ occurs.

4.1.2  The  Plan  Calculus 

Flow  graphs  annotated  with  the  attributes  and  conditions  described  in  the  previous  section 
can become difficult for people to read. For presentation purposes, we make use of a graphical
notation, called the Plan Calculus [110, 117], which aids people in viewing flow graphs with
certain classes of constraints pertaining to programming. However, although the Plan
Calculus  is  used  as  a  visual  aid,  the  underlying  attributed  flow  graph  representation  is 
conceptually  primary  to  our  recognition  approach. 

The  Plan  Calculus  is  a  graphical  formalism  for  representing  programs,  cliches,  and 
relationships  between  cliches.  In  the  Plan  Calculus,  both  cliches  and  individual  programs 
are  represented  as  plans.  The  relationships  between  cliches  are  captured  in  overlays.  This 
section  briefly  describes  plans  and  overlays  as  they  relate  to  our  attributed  flow  graph 
formalism.  (For  more  details,  see  Rich  [110,  117].) 

A  plan  graphically  represents  the  operations  of  a  program  and  the  data  and  control  flow 
constraints  between  them  in  what  is  called  a  plan  diagram.  (Plans  also  specify  preconditions 
and  postconditions  in  a  separate  logical  language.)  A  plan  diagram  is  a  hierarchical  graph 
structure  composed  of  boxes  and  arrows.  Boxes  denote  operations  and  tests,  while  arrows 
denote  control  flow  and  dataflow. 

Plan  diagrams  can  be  seen  as  graphical  depictions  of  flow  graphs  with  certain  classes 
of  attributes  and  conditions  -  those  that  pertain  to  control  flow  and  data  aggregation. 
Plan  diagrams  and  flow  graphs  share  the  same  dataflow  structure  in  that  boxes  represent 
operations  and  arcs  denote  dataflow  between  them.  However,  plan  diagrams  also  have  arcs 
that  denote  control  flow  and  join  boxes  that  represent  the  merging  of  control  flow.  A  control 
flow  arc  from  a  box  A  to  a  box  B  denotes  that  B  eventually  (not  necessarily  immediately) 
follows  A.  A  branch  in  control  flow  is  represented  by  a  test  box.  The  rejoining  of  control 
flow  is  represented  by  a  join  box.  It  has  two  sets  of  incoming  dataflow  arcs,  one  for  each  case 
of  the  corresponding  test  that  caused  the  control  flow  to  branch  out.  The  set  of  dataflow 
arcs  leaving  the  join  carry  the  data  of  the  set  of  inputs  on  either  the  T  or  the  F  side  of  the 
join,  depending  on  whether  the  T  or  the  F  branch  (respectively)  of  the  conditional  is  taken. 

Like  flow  graph  edges,  dataflow  arcs  may  fan  out  (which  means  the  result  of  an  operation 
is  used  by  more  than  one  operation).  However,  they  cannot  fan  into  the  same  input,  as 
edges  can  in  flow  graphs.  Instead,  they  are  merged  by  join  boxes.  Control  flow  arcs  may 
fan  in  or  out. 

Figure 4-6 shows an example of a plan diagram, representing the following code fragment.

(let ((tax 0.0))
  (when (> gross min)
    (setq tax (* percent gross)))
  (- gross tax))

Solid  arcs  denote  dataflow;  cross-hatched  arcs  denote  control  flow.  Each  box  in  the  plan 
has  a  label,  composed  of  a  part  name  and  a  type.  For  instance,  the  label  “multiply:*” 
specifies  that  the  plan  in  Figure  4-6  has  a  part  named  “multiply”  of  type  “*.”  The  part 
names  serve  to  distinguish  between  boxes  in  the  plan  that  have  the  same  type.  The  part 
names in a given plan diagram must be distinct. The part “test” is a test box. Although in
this example, “test” has no data outputs, in general, data may flow out of a test box from
either the side labeled T or the side labeled F, depending on whether the output is produced
when the test succeeds or fails, respectively. The box named “end” is a join. Its outgoing
dataflow arc carries the data coming from “multiply” when GROSS>MIN (and the T branch
of “test” is executed), and 0.0, otherwise.

Figure 4-6: The plan diagram for a code fragment.
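
The role of the join box in Figure 4-6 can be mimicked in a few lines of Python. This is only an illustrative sketch: the names join and net_pay are ours and are not part of the Plan Calculus, and ordinary Python evaluation stands in for dataflow.

```python
def join(test_succeeded, t_value, f_value):
    """Join box: forward the T-side input if the test succeeded, else the F-side input."""
    return t_value if test_succeeded else f_value


def net_pay(gross, minimum, percent):
    # "compare" feeds the "test" box, which branches control flow.
    compare = gross > minimum
    # "multiply" sits on the T branch, so it only produces a value there.
    multiply = percent * gross if compare else None
    # "end" is the join: it carries multiply's output on the T side, 0.0 on the F side.
    tax = join(compare, multiply, 0.0)
    # "subtract" runs after the join, whichever branch was taken.
    return gross - tax
```

Because each join input belongs to exactly one branch of the test, no dataflow arc ever needs to fan into the same input.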

The  control  flow  arcs,  test,  and  join  boxes  represent  the  control  flow  information  that  is 
in  the  control  environment  attributes.  Boxes  that  represent  operations  that  are  tied  together 
by  control  flow  arcs  correspond  to  nodes  that  are  all  in  the  same  control  environment  in  our 
flow  graphs.  The  relationships  between  control  environments  are  reflected  in  the  structure 
of  the  control  flow  arcs.  The  ce-from  attributes  and  conditions  on  dataflow  edges  are 
represented  by  dataflow  routed  through  joins,  which  explicitly  specify  in  which  case  of  a 
conditional  branch  data  flows  from  a  particular  operation  to  another. 

Control  flow  arcs  are  sometimes  omitted  when  there  is  no  conditional  structure  (i.e.,  all 
operations  are  in  the  same  control  environment).  For  example,  in  Figure  4-6,  the  control 
flow  arcs  between  “compare”  and  “test”  and  between  “end”  and  “subtract”  can  be  omitted. 

Plans  may  contain  other  plans  as  parts.  If  the  type  of  a  plan  and  a  subplan  within  it 
are  the  same,  then  the  plan  is  recursively  defined.  An  example  is  given  in  Figure  4-7.  This 
is  the  plan  diagram  representing  the  following  code  fragment  which  iterates  over  a  list  L, 
counting  the  number  of  elements  in  it.  A  dashed  box  delimits  the  recursive  subplan,  with 
enough  details  filled  in  to  show  the  input-  and  output-corresponds  relations. 

(LET ((COUNT 0))
  (LOOP (WHEN (NULL L) (RETURN COUNT))
        (SETQ L (CDR L))
        (SETQ COUNT (1+ COUNT))))
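
To make the recursive reading of this loop concrete, the sketch below restates it in Python as a tail-recursive function. It is an illustration only: the name count_elements and the use of Python lists in place of Lisp lists are our assumptions.

```python
def count_elements(lst, count=0):
    # Exit test, corresponding to (NULL L): on success the current count
    # is passed out of the recursion unchanged.
    if not lst:
        return count
    # Recursive subplan: the next iteration receives (CDR L) and (1+ COUNT),
    # which is what the input-corresponds relations record.
    return count_elements(lst[1:], count + 1)
```

The arguments of the recursive call play the role of the inputs of the dashed subplan box, and the returned value plays the role of its output.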
Figure 4-7: A recursively defined plan.

Figure 4-8: Data plan for Circular Indexed Sequence.

Figure 4-9: Plan for extracting an element from a Circular Indexed Sequence.

Plan  diagrams  can  contain  data  as  parts.  A  data  plan  is  a  plan  whose  parts  are  all 
either  data  or  (hierarchically)  data  plans.  For  example,  Figure  4-8  shows  a  data  plan 
diagram  representing  the  Circular  Indexed  Sequence  (CIS)  data  structure.  Figure  4-9  shows 
a  hierarchical  plan  that  contains  both  data  and  computational  parts.  It  is  the  plan  diagram 
for  the  familiar  computation  of  extracting  an  element  from  a  CIS.  The  two  data  subplans, 
which  represent  the  aggregation  of  data,  depict  the  accessor  and  constructor  information 
that  we  encode  in  accessor  and  constructor  edge  attributes  on  flow  graphs. 
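
As a rough illustration of a data plan and a computation over it, the sketch below models a CIS as an immutable record and extraction as a function from an old CIS to an element and a new CIS. The field names base, first, and last, and the wrap-around extraction from the first index, are assumptions made for the example; the actual parts are those shown in Figures 4-8 and 4-9.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class CircularIndexedSequence:
    # Hypothetical parts of the data plan: a base sequence and two indices.
    base: Tuple[object, ...]
    first: int
    last: int


def cis_extract(old):
    """Return (element, new CIS); the old CIS is left untouched."""
    element = old.base[old.first]                   # accessor: read a part of the old CIS
    new_first = (old.first + 1) % len(old.base)     # index wraps around the base
    new = replace(old, first=new_first)             # constructor: build the new CIS
    return element, new
```

Reading the fields corresponds to accessor information and building the new record corresponds to constructor information.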

4.1.3  Codifying  Cliches:  Using  the  Plan  Calculus  as  a  Stepping  Stone 

Plans are used in the Plan Calculus both to represent programs and to define cliches.
Relationships between cliches are represented by overlays. An overlay is a pair of plans and
a set of correspondences between their parts. Overlays show how an instance of one cliche can
be  viewed  as  an  instance  of  another.  Overlays  provide  a  general  facility  for  representing 
common  shifts  of  viewpoint,  such  as  implementing  specifications  and  data  abstractions,  and 
temporally  abstracting  iterations. 

As  grammar  writers,  we  found  it  easier  to  express  cliches  in  the  Plan  Calculus  first  and 
then  to  translate  the  plan  definitions  and  overlays  into  graph  grammar  rules. 

This section describes overlays and shows examples of how relationships between cliches
are captured in them. It then describes how overlays and plan definitions of cliches are
encoded in attributed flow graph grammar rules.

Figure 4-10: Implementation overlay showing how FIFO-Dequeue can be implemented by
CIS-Extract.

Implementation  Relationships 

Recognizing  cliches  on  multiple  levels  of  abstraction  requires  being  able  to  view  some  cliches 
as  implementations  of  more  abstract  cliches.  In  the  Plan  Calculus,  implementation  overlays 
capture  these  relationships. 

The  plan  on  the  right  of  an  implementation  overlay  is  the  plan  definition  for  an  abstract 
operation  or  data  structure.  The  plan  on  the  left  of  the  overlay  is  the  plan  definition  of  a 
correct  implementation  of  the  abstract  operation  or  data  structure  represented  on  the  right. 

For example, Figure 4-10 shows an implementation overlay that expresses the relationship
between the abstract cliched operation FIFO-Dequeue and one possible implementation
of it, which is as a CIS-Extract cliche. The correspondences between the two sides of the
overlay show how the inputs and outputs of the abstract operation are related to those
of the implementation. They may be labeled with names of data overlays, as is the
correspondence between the input FIFO on the right and the input CIS on the left. The
CIS-Extract-as-FIFO-Dequeue overlay represents an implementation of the FIFO-Dequeue
operation, in which the FIFO is implemented as a Circular Indexed Sequence. The old and
new FIFOs of the FIFO-Dequeue operation correspond to the old and new Circular Indexed
Sequences of the implementation plan. These correspondences are labeled with the name
of the Circular-Indexed-Sequence-as-FIFO data overlay, which means that the old (resp.
new) CIS of CIS-Extract, when viewed as a FIFO, corresponds to the old (resp. new) FIFO
of FIFO-Dequeue.

Encoding  Implementation  Overlays  in  Grammar  Rules 

Our  grammar  formalism  was  developed  to  make  it  easy  to  represent  shifts  of  viewpoint 
from  both  abstract  operations  and  abstract  data  structures  to  their  implementations.  It  is 
specifically  able  to  encode  the  relationships  expressed  in  implementation  overlays,  including 
those  in  which  the  left-side  plan  definition  contains  data  plans  for  aggregate  data  structures 
as  subplans. 

Each  plan  definition  of  the  algorithmic  cliches  is  encoded  in  a  flow  graph  grammar  rule. 
The  type  of  the  left-hand  side  node  of  the  rule  is  the  plan’s  name.  The  right-hand  side  is 
the  flow  graph  encoding  of  the  plan,  in  which  the  control  flow  constraints  summarized  in 
the  structure  of  the  plan  are  listed  in  attribute  conditions.  If  the  inputs  or  outputs  of  the 
plan  definition  are  data  plans,  the  aggregation  they  represent  is  encoded  in  the  embedding 
relation  of  the  rule. 

In  particular,  suppose  an  input  (or  output)  of  a  plan  definition  is  an  aggregate  data 
structure  of  type  D,  represented  by  a  data  subplan.  The  rule  encoding  of  the  plan  definition 
will  have  a  left-hand  side  port  whose  type  is  D  which  corresponds  to  a  tuple  of  right-hand 
side  and  left-hand  side  ports.  For  each  part  pi  of  the  data  plan,  the  ith  element  of  the  tuple 
is  the  set  of  right-hand  side  ports  (if  any)  that  encode  the  inputs  or  outputs  of  boxes  to 
which  the  part  is  connected.  If  the  part  is  connected  directly  to  a  part  in  another  data  plan 
in  the  plan  definition,  then  the  tuple  will  include  the  left-hand  side  port  that  encodes  that 
data  plan. 

(One  way  to  see  this  encoding  is:  the  ports  in  the  tuple  are  determined  as  if  the  input 
(or  output)  data  plan  were  replaced  by  a  fringe  Spread  (or  Make)  node.  The  embedding 
relation  that  results  from  removing  these  fringe  nodes  (as  described  in  Section  3.4.2)  is  the 
same  as  the  embedding  resulting  from  this  encoding.) 

For  example,  Figure  4-11  shows  the  flow  graph  grammar  rule  encoding  of  the  CIS- 
Extract  plan  definition  of  Figure  4-9.  (This  figure  is  a  repeat  of  Figure  3-24.)  Attribute 
conditions  and  transfer  rules  are  not  shown. 
Figure 4-11: Rule encoding plan for CIS-Extract.


Currently, we are limited to encoding only those plans that contain data subplans only
at their inputs or outputs. However, internal data subplans can be represented by collapsing
a sub-flow graph of the flow graph that represents the left side of the overlay into a
non-terminal. This sub-flow graph can have the data plan as its input/output.

In  addition  to  plan  definitions  of  cliches,  each  implementation  overlay  is  encoded  as  a 
flow graph grammar rule. These rules contain single nodes on both sides. The left-hand
side  node’s  type  is  the  type  of  the  abstract  operation  on  the  right  side  of  the  overlay.  The 
right-hand  side  node’s  type  is  the  name  of  the  implementation  plan  on  the  overlay’s  left 
side. 

The  embedding  relation  encodes  the  correspondences  between  the  two  sides  of  the  over¬ 
lay.  If  there  is  a  correspondence  between  an  input  (or  output)  of  the  abstract  operation  on 
the  right  side  of  the  overlay  and  an  input  (or  output)  of  the  implementation  plan,  then  the 
left-  and  right-hand  side  ports  that  encode  them  in  the  grammar  rule  correspond  to  each 
other in the rule’s embedding relation. For example, Figure 4-12 shows the grammar rule
encoding  of  the  overlay  of  Figure  4-10. 

Sometimes  a  correspondence  is  labeled  with  the  name  of  a  data  overlay  that  maps 
an  abstract  data  type  to  a  concrete  one.  This  mapping  information  is  associated  with  the 
corresponding  ports  in  the  rule.  Different  ports  may  have  different  data  mappings  associated 
with  them,  even  if  they  are  of  the  same  type. 

Figure 4-12: Rule encoding the CIS-Extract-as-FIFO-Dequeue overlay.

When a rule that encodes an overlay is used in a parse, it uncovers a design decision
to implement a certain abstract operation or data structure as another operation or data
structure. The overlay mapping information is used to generate documentation of this
design decision.
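
The following Python sketch suggests how a rule encoding an overlay might be held as data and used to document a design decision. It is a simplified model rather than the representation used in our implementation: the class OverlayRule, the port names, and the describe function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class OverlayRule:
    lhs_type: str   # type of the abstract operation (right side of the overlay)
    rhs_type: str   # name of the implementation plan (left side of the overlay)
    # Embedding relation: lhs port -> (rhs port, data overlay label or None).
    embedding: Dict[str, Tuple[str, Optional[str]]]


# Hypothetical port names for the overlay of Figure 4-10.
rule = OverlayRule(
    lhs_type="FIFO-Dequeue",
    rhs_type="CIS-Extract",
    embedding={
        "old-fifo": ("old-cis", "Circular-Indexed-Sequence-as-FIFO"),
        "new-fifo": ("new-cis", "Circular-Indexed-Sequence-as-FIFO"),
        "element": ("element", None),
    },
)


def describe(rule):
    """Produce one documentation line per design decision uncovered by the rule."""
    lines = [f"{rule.lhs_type} is implemented as {rule.rhs_type}."]
    for lhs_port, (rhs_port, mapping) in rule.embedding.items():
        if mapping is not None:
            lines.append(f"{lhs_port} corresponds to {rhs_port} via {mapping}.")
    return lines
```

Each port carries its own mapping label, so two ports of the same type can be associated with different data overlays.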

Temporal  Abstraction 

In  recognizing  an  iterative  program,  it  is  often  useful  to  view  cliched  fragments  of  itera¬ 
tive  computation  as  operations  on  a  sequence  of  values.  This  technique  is  called  temporal 
abstraction.  (See  [110,  117,  127,  138].) 

For example, a common computation that occurs in iterative programs is: on each
iteration  a  function  is  applied  to  the  result  of  the  previous  application  of  the  function  (or  to 
an  initial  value  on  the  first  iteration).  This  is  called  the  generation  cliche.  The  plan  diagram 
for  this  iteration  cliche  is  shown  on  the  left  in  the  overlay  of  Figure  4-13.  A  common  instance 
of  generation  is  counting-up,  in  which  the  generating  function  is  1+. 

The  temporally  abstracted  view  of  generation  is  as  an  operation  Generate  that  takes  an 
initial  value  and  a  generating  function  and  creates  a  sequence  of  values  -  the  values  processed 
over  time,  one  per  iteration.  For  example,  the  temporal  abstraction  of  the  counting-up  cliche 
is  the  operation  Count,  which  takes  an  initial  value  (i)  and  produces  the  sequence  of  values 
[i, i+1, (i+1)+1, ...].
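
The temporally abstract view can be illustrated with a Python generator. This is a sketch only: the function names generate and count mirror the operations Generate and Count, but modeling a temporal sequence as a lazy Python iterator is our assumption.

```python
from itertools import islice


def generate(initial, f):
    """Yield initial, f(initial), f(f(initial)), ... one value per iteration."""
    value = initial
    while True:
        yield value
        value = f(value)


def count(i):
    """Counting-up viewed as a sequence: the generating function is 1+."""
    return generate(i, lambda x: x + 1)


# The sequence is unbounded, so only a prefix is taken here.
first_four = list(islice(count(3), 4))
```

Each value yielded corresponds to the value processed on one iteration of the original loop.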

The  temporal  abstraction  of  iteration  cliches  is  formalized  in  the  Plan  Calculus  using 
temporal  overlays.  These  relate  a  temporally  abstract  operation  (on  the  right  side  of  the 
overlay)  to  the  plan  for  an  iteration  cliche  (on  the  left  side).  Figure  4-13  shows  a  temporal 
overlay  formalizing  the  temporal  abstraction  of  generation  as  a  Generate  operation. 

The correspondence labeled with an asterisk is called a temporal correspondence. This
denotes  the  relationship  between  the  left  side  data  part  (the  input  to  apply)  and  the  right 
side  temporal  sequence  (the  output  of  Generate).  It  specifies  that  the  first  term  of  the 
temporal  output  sequence  of  Generate  is  equal  to  the  initial  input  to  apply;  the  second  term 
is  equal  to  the  same  part  of  the  recursively  defined  plan;  and  so  on  recursively.  Temporal 
overlays  always  contain  at  least  one  temporal  correspondence. 

Figure 4-13: Temporal overlay showing the view of Generation as a Generate operation.

Temporal abstraction allows an iterative program that is composed of iteration cliches
to be seen as a composition of functions on sequences. This makes the program as easy to
understand and reason about as a non-iterative (straight-line) program.

Temporal abstraction also enables GRASPR to undo common function-sharing optimizations
within iterative programs, such as loop-jamming, using the same techniques it uses to
deal with function-sharing due to common subexpression elimination. (These are the
techniques for parsing structure-sharing flow graphs, as is discussed further in Section 5.1.5.)

Also, it is easier to encode cliches by building them out of temporally abstract operations,
rather  than  expressing  them  as  large,  flat  iteration  patterns.  Additionally,  a  composition 
of  abstract  operations  is  easier  to  describe  than  a  combination  of  overlapping,  interleaved 
iteration  cliches. 


Encoding  Temporal  Abstractions  in  Grammar  Rules 

As  with  implementation  relationships,  flow  graph  grammar  rules  are  able  to  capture  tem¬ 
poral  abstractions  by  a  straightforward  encoding  of  temporal  overlays. 

Like  any  other  algorithmic  cliche,  the  plan  diagram  for  an  iteration  cliche  is  encoded  in 
a  grammar  rule  whose  left-hand  side  is  a  node  whose  type  is  the  name  of  the  cliche.  The 
right-hand  side  is  the  dataflow  structure  of  the  plan  diagram. 

The relationships between the inputs (resp. outputs) of the recursively defined plan and
the inputs (resp. outputs) of the recursive subplan are captured in “input-corresponds?”
and “output-corresponds?” conditions. For example, the rule for generation is shown in
Figure 4-14. It has attribute conditions that constrain the output of f to input-correspond
to the input of f.

Node-Type Constraints:

  f: (lambda (node-type) T)

Attribute Conditions:

  1. (input-corresponds? (p> f 2) (p> f 1)
                         (feedback-ce (innermost-recur (n> f))))

  2. (ce= (ce-from (st-thru> 1 2))
          (recur-ce (innermost-recur (n> f))))

Attribute-Transfer Rules:

  1. ce := (ce (n> f))

  2. generating-function := (node-type (n> f))

Figure 4-14: Grammar rule encoding the plan for Generation.

This  rule’s  right-hand  side  is  not  exactly  the  dataflow  structure  of  generation’s  plan 
definition.  The  plan  definition  takes  a  function  input,  which  is  iteratively  applied,  but  the 
right-hand  side  flow  graph  does  not  explicitly  represent  this  functional  input  and  application. 
Instead,  the  right-hand  side  node  has  a  generalized  node  type,  which  means  the  rule  imposes 
a  constraint  on  the  types  of  input  graph  nodes  or  non-terminal  instances  that  can  match 
this  node.  In  the  rule  for  generation,  the  node  type  constraint  is  loose:  any  node  type 
matches. So any instance of a cliched unary operation or a unary primitive operation that
satisfies the input-corresponds relationships will be recognized as an instance of generation.
(Generalized node types are used as a shorthand for several rules that have the same left-
and right-hand sides, except for variation in the node types of the right-hand side nodes.)

The  reason  the  apply  operation  is  not  encoded  directly  in  the  grammar  rule  as  a  node 
of  type  “apply”  is  that  there  would  not  be  an  input  graph  node  to  match  it.  Also,  this 
grammar  rule  cannot  be  used  to  recognize  generation  in  programs  in  which  the  generating 
function  is  an  arbitrary  composition  of  functions.  This  limitation  is  discussed  in  more  detail 
in  Section  5.2.3. 

The  type  of  the  input  graph  node  matching  the  right-hand  side  is  transferred  to  the  left- 
hand  side’s  generating-function  attribute.  This  can  be  constrained  in  attribute  conditions 
of  rules  that  use  generation. 

Control flow constraints captured in the iteration cliche’s plan are encoded in attribute
conditions referring to the control environments of the recursion (recur-ce, feedback-ce, and
outside-ce). For example, the plan diagram for the cliche iterative-search is shown on the
left in the overlay of Figure 4-15. This iteration cliche is the familiar pattern of repeatedly
applying some test until it is satisfied by some value. When the test succeeds, the iteration
is terminated and the value is made available outside the iteration. This iteration cliche is
encoded in the flow graph grammar rule shown in Figure 4-16. (In the figure, ce<= stands
for ⊑ and ce= is the equality relation between control environments.)

Figure 4-15: Temporal overlay relating the plan for Iterative Search and the operation
Earliest.

The  first  condition  in  this  rule  encodes  the  constraint  summarized  by  the  control  flow 
arcs,  test,  and  join:  the  test  must  be  an  exit  test  of  the  iteration.  This  constraint  translates 
to  a  condition  on  how  the  control  environments  of  the  test  and  the  recursion  relate.  In 
particular,  the  recursive  call  should  occur  in  the  failure-ce  of  the  test  and  the  recursion 
should  be  exited  in  the  success-ce  of  the  test. 

The  attribute  condition  actually  loosens  this  constraint  slightly  to  allow  for  other  exit 
tests  of  the  recursion.  The  two  parts  of  the  condition  are: 

1. It must be possible for the recursive call to occur in the failure-ce of the test (but
another exit test may occur in the failure-ce which can prevent this from happening).
This is expressed as: the feedback-ce of the innermost recursion containing the test
must be ⊑ the failure-ce of the test.

2. The success-ce of the test is one possible way to exit the recursion (but there may be
another exit test in whose success-ce the recursion is also exited). This is expressed
as: the success-ce must be ⊑ the outside-ce of the recursion.

This constraint occurs in the encoding of many iteration constraints, so we defined a
predicate, exit-predicate, that takes a terminal or non-terminal test node and checks these
conditions. So the abbreviated form of the first condition in Figure 4-16 is (exit-predicate
(n> P)). For example, the topmost null-test terminal node in Figure 4-2 is an exit-
predicate.

Attribute Conditions:

  1. ... (ce<= (success-ce (n> P))
               (outside-ce (innermost-recur (n> P)))))

  2. (ce= (ce-from (st-thru> 1 2))
          (success-ce (n> P)))

  3. (ce= (ce-from
            (output-edge (recursive-node (innermost-recur (n> P)))
                         (edge-sink (st-thru> 1 2))))
          (feedback-ce (innermost-recur (n> P))))

Attribute-Transfer Rules:

  1. ce := (ce (n> P))

  2. search-predicate := (node-type (n> P))

  3. success-ce := (success-ce (n> P))

  4. failure-ce := (failure-ce (n> P))

Figure 4-16: Grammar rule for Iterative Search cliche.
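
The two-part check performed by exit-predicate can be sketched in Python. The modeling choice here is an assumption made purely for illustration: each control environment is represented as a set of abstract situation labels, and the ordering on control environments is approximated by set inclusion. The class and field names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet

ControlEnv = FrozenSet[str]   # assumed model: the situations in which the environment is active


@dataclass(frozen=True)
class ExitTest:
    success_ce: ControlEnv
    failure_ce: ControlEnv


@dataclass(frozen=True)
class Recursion:
    feedback_ce: ControlEnv
    outside_ce: ControlEnv


def exit_predicate(test, recursion):
    # Part 1: the recursive call can occur in the failure-ce of the test.
    can_recur_on_failure = recursion.feedback_ce <= test.failure_ce
    # Part 2: the success-ce of the test is one way to exit the recursion.
    can_exit_on_success = test.success_ce <= recursion.outside_ce
    return can_recur_on_failure and can_exit_on_success
```

Both comparisons are inclusions rather than equalities, which is what leaves room for other exit tests in the same recursion.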

The  second  attribute  condition  in  the  rule  for  iterative-search  constrains  the  output  to 
carry  dataflow  in  the  success-ce  of  the  test.  This  expresses  the  constraint  that  the  output 
of the iterative-search cliche is the first element to pass the test.

The  third  condition  encodes  the  constraint  that  is  depicted  by  the  data  and  control 
flow  edges  from  the  recursive  sub-plan  to  the  exit  join  in  the  plan  diagram  of  Figure  4-15. 
This  constraint  is  that  the  output  dataflow  of  the  recursion  that  merges  with  the  st-thru 
must  carry  dataflow  in  the  feedback-ce  of  the  innermost  recursion  containing  the  test.  This 
ensures  that  there  is  no  additional  computation  being  performed  on  the  way  up  out  of  the 
recursion. 

The  function  recursive-node  finds  the  input  graph  node  that  represents  the  recursive 
call  of  the  recursion  containing  the  exit  test.  The  function  output-edge  finds  the  edge  from 
some  output  port  of  a  recursive  node  to  an  input  port.  This  function  is  only  used  when  the 
recursive  node  is  expected  to  have  only  one  output  port  that  connects  to  the  input  port. 
(The  constraint  fails  if  this  is  not  true.)  In  this  case,  output-edge  finds  the  edge  that  shares 
its  sink  with  the  edge  matching  the  st-thru. 

This rather awkward type of condition is imposing a structural constraint (as well as
the ce-from constraint) which cannot be expressed in the structure of the rule’s right-hand
side flow graph. It requires that there be an edge from a recursive node directly to the
output that merges with the st-thru. This constraint is expressed in attribute conditions,
rather than in the structure of the right-hand side of the rule because there is no way to
represent the edge from the recursive node to the output without including the recursive
node in the right-hand side. The edge cannot be expressed as a st-thru, since its source is
not an input to the non-terminal. If we did include the recursive node, we would have to
specify its arity. This would severely restrict the programs in which it can be matched to
only those with recursive nodes of the specified arity.

Attribute-Transfer Rules:

  1. ce := (outside-ce (innermost-recur (n> Iterative-Search)))

  2. search-predicate := (search-predicate (n> Iterative-Search))

Figure 4-17: Grammar rule encoding the temporal overlay Iterative-Search-as-Earliest.

The  attribute-transfer  rules  shown  in  Figure  4-16  specify  that  all  of  the  control  envi¬ 
ronment  attributes  of  the  exit  predicate  are  transferred  to  the  non-terminal  representing 
iterative-search. 

A temporal abstraction of iterative-search is the Earliest operation. This operation takes
a  sequence  of  values  and  a  predicate  and  finds  the  first  term  in  the  sequence  satisfying  the 
predicate.  This  relationship  is  shown  in  the  overlay  of  Figure  4-15. 
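
Viewed as a function on sequences, Earliest is small enough to sketch directly. The Python function below is an illustration under the assumption that a temporal sequence is modeled as an iterable; returning None when no term satisfies the predicate is our choice and is not specified by the operation.

```python
from itertools import count


def earliest(sequence, predicate):
    """Return the first term of the sequence that satisfies the predicate."""
    for term in sequence:
        if predicate(term):
            return term
    return None  # assumed behavior for a finite sequence with no satisfying term


# Composition of sequence operations: search the counting-up sequence from 10
# for its first multiple of 7.
first_multiple = earliest(count(10), lambda n: n % 7 == 0)
```

The search stops at the first satisfying term, just as the iteration is terminated when the exit test succeeds.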

A temporal overlay is encoded in a grammar rule in the same way as implementation
overlays.  Figure  4-17  shows  the  rule  for  Earliest. 

When  an  iteration  cliche  is  viewed  as  a  temporally  abstract  operation,  the  operation 
is  seen  as  being  in  the  control  environment  from  which  the  iteration  is  called  (i.e.,  its 
outside-ce).  This  is  expressed  in  the  attribute-transfer  rules  of  the  rule  encoding  a  temporal 
abstraction:  the  control  environment  of  the  temporally  abstract  operation  is  the  outside-ce 
of  the  innermost  recursion  containing  the  iteration  cliche. 

4.1.4  Examples  of  Codifying  Simulation  Cliches 

We  used  the  Plan  Calculus  as  a  stepping  stone  in  capturing  our  cliches  and  then  encoding 
them  in  a  flow  graph  grammar.  This  section  gives  a  flavor  for  how  we  did  this.  It  shows  the 
plan  definitions  and  overlays  that  capture  some  of  the  cliches  that  were  described  in  English 
in Chapter 2. It then gives the grammar rules GRASPR uses in recognizing these cliches.

Encoding  Event-Driven  Simulation  Cliches 

Recall from Section 2.1.3 that the event-driven simulation algorithm consists of the
following key steps:

• The event-driven simulator is given an initial EVENT, whose Object is a starting MESSAGE
  and whose Time is the MESSAGE’s arrival time. This is added to the EVENT-QUEUE.

• On each step of the simulation, the highest priority EVENT is pulled from the EVENT-QUEUE
  and processed.

• Processing an EVENT means simulating the handling of the MESSAGE in the EVENT’s
  Object part. This involves:

  - looking up the ASYNCH-NODE in the ADDRESS-MAP that is indexed by the Destination-
    Address part of the MESSAGE.

  - updating the ASYNCH-NODE’s Clock to be the maximum of its current time and
    the Time part of the EVENT. This creates a new ASYNCH-NODE.

  - creating a new ADDRESS-MAP in which MESSAGE’s Destination-Address part is mapped
    to the new ASYNCH-NODE.

  - handling MESSAGE in the context of the ASYNCH-NODE.

• The event-driven simulation ends when the EVENT-QUEUE is empty.

Figure 4-18: Plan definition for Event-Driven Simulation cliche.
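
The key steps listed above can be sketched as a small Python loop. This is an illustration under several assumptions that are not part of the cliche: events are ordered by earliest time, messages and nodes are plain dictionaries, and the hypothetical handle function returns a list of (time, message) pairs for any new events it schedules.

```python
import heapq
from itertools import count


def simulate(initial_message, arrival_time, address_map, handle):
    tie = count()  # tie-breaker so that messages themselves are never compared
    event_queue = []
    # First step: add the starting event to the event queue.
    heapq.heappush(event_queue, (arrival_time, next(tie), initial_message))
    while event_queue:  # the simulation ends when the queue is empty
        time, _, message = heapq.heappop(event_queue)  # highest priority event
        destination = message["destination"]
        node = address_map[destination]                # look up the node
        # Update the clock, creating a new node rather than mutating the old one.
        new_node = {**node, "clock": max(node["clock"], time)}
        # Create a new address map in which the destination maps to the new node.
        address_map = {**address_map, destination: new_node}
        # Handle the message in the context of the node (assumed to yield new events).
        for new_time, new_message in handle(message, new_node):
            heapq.heappush(event_queue, (new_time, next(tie), new_message))
    return address_map
```

Building a fresh node and a fresh address map on every step mirrors the way the steps describe old and new versions of each data structure.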

The event-driven simulation algorithm is encoded as a composition of two temporally
abstract operations, called Generate-Event-Queues-and-Nodes and Co-Earliest-EDS-Finished,
and a Priority-Queue Insert. The Priority-Queue Insert is the operation performed on the
first step of the simulation, which is to add a starting EVENT to the EVENT-QUEUE.

The temporally abstract operations embody the following temporally abstract view of
the iterative actions of the simulator. The simulator generates two sequences: one is a
sequence of EVENT-QUEUEs and the other is a sequence of ADDRESS-MAPs, using an operation
called Generate-Event-Queues-and-Nodes. It does this by repeatedly applying a function
that extracts the highest priority element (an EVENT) from the EVENT-QUEUE and processes
it. These two sequences feed into a temporally abstract operation called Co-Earliest-EDS-
Finished. This operation returns the ADDRESS-MAP in the input sequence of ADDRESS-MAPs
that corresponds to the first empty EVENT-QUEUE in the other input sequence of EVENT-QUEUEs.
(These two operations are described further below.)

Temporal abstraction allows us to express this cliche as a simple composition of temporally
abstract operations. The complexity of how data feeds back during iteration and how
the output relates to the exit predicate is pushed down into the encoding of the individual
operations.

Generate-Event-Queues-and-Nodes

Generate-Event-Queues-and-Nodes is a temporal abstraction of the iteration cliche
Dequeue-and-Process-Generation, as shown in the overlay in Figure 4-19. This iteration
cliche is a special case of the generation cliche. The generating function is a composition
of Priority-Queue Extract and Process-Event.

This is slightly more complicated than the generation cliche described in Section 4.1.3 in
that it generates two sequences, rather than one. On each iteration, the generating function
is applied to the two results of the function’s application on the previous iteration.
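As a rough illustration (not part of GRASPR), the two-sequence generation can be sketched in Python. The names generate_pairs and toy_step are hypothetical, and treating the initial values as the first terms of the two sequences is an assumption of this sketch.

```python
from itertools import islice


def generate_pairs(step, first, second):
    """Yield successive (first, second) states.

    `step` plays the role of the generating function: it is applied to the
    two results produced on the previous iteration.
    """
    while True:
        yield first, second
        first, second = step(first, second)


def toy_step(queue, log):
    """Hypothetical generating function: move the head of `queue` onto `log`."""
    return queue[1:], log + queue[:1]


# Take the first four terms of the paired sequences.
states = list(islice(generate_pairs(toy_step, (3, 1, 2), ()), 4))
```

Each element of states pairs one term of the first sequence with the corresponding term of the second, which is the alignment that Co-Earliest relies on.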

Co-Earliest-EDS-Finished 

Co-Earliest-EDS-Finished is a special case of a more general temporally abstract operation,
called Co-Earliest, which is related to the Earliest operation described in Section 4.1.3.
Co-Earliest takes two input sequences, S1 and S2, and a predicate, and it returns the term
of S2 that corresponds to the first term of S1 satisfying the predicate.
Co-Earliest-EDS-Finished is an instance of Co-Earliest in which the predicate is a test for
whether the simulation is finished.
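The behavior of Co-Earliest can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name co_earliest is ours, the sequences are modeled as finite Python iterables, and returning None when no term satisfies the predicate is a choice the text does not specify.

```python
def co_earliest(s1, s2, predicate):
    """Return the term of s2 aligned with the first term of s1 that satisfies predicate."""
    for a, b in zip(s1, s2):
        if predicate(a):
            return b
    return None  # assumption: no term of s1 satisfied the predicate


# Toy instance of Co-Earliest-EDS-Finished: the predicate tests for an empty queue.
queues = [(3, 1), (1,), ()]
maps = ["map0", "map1", "map2"]
finished_map = co_earliest(queues, maps, lambda q: len(q) == 0)
```

Here finished_map is the map that was current when the queue first became empty, which mirrors how the simulation returns the current ADDRESS-MAP on termination.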

It is a temporal abstraction of the Co-Iterative-EDS-Finished iteration cliche, as shown
in the overlay of Figure 4-20. This iteration cliche is the iterative fragment that terminates
the simulation when the current EVENT-QUEUE is empty, returning the current value of the
ADDRESS-MAP.

The temporally abstract operation Co-Earliest-EDS-Finished views the sequences of
EVENT-QUEUEs and ADDRESS-MAPs processed over the iterations as its two inputs. It returns the
ADDRESS-MAP in the sequence of ADDRESS-MAPs that corresponds to the first empty EVENT-QUEUE
in the sequence of EVENT-QUEUEs.

The grammar rules in Figures 4-21 and 4-22 encode the information in the plan definitions
and overlays discussed so far. A legend specifies port type abbreviations used in
the figure. (The plan definitions, overlays, and the corresponding grammar rules for the
Priority-Queue operations of Empty?, Insert, and Extract are not shown here, since they
do not illustrate any new points.)

[Figure 4-19: Overlay showing the temporal abstraction of the iteration cliche
Dequeue-and-Process-Generation. Overlay name: EDS-Generate-as-Dequeue-Process-Generation.]

[Figure 4-20: Overlay showing the temporal abstraction of the iteration cliche
Co-Iterative-EDS-Finished. Overlay name: Co-Iterative-EDS-Finished-as-Co-Earliest-EDS-Finished.]

[Figure 4-21: Grammar rules for some Event-Driven Simulation cliches. Rule diagrams omitted;
their annotations follow.]

Rule relating Generate-Event-Queues-and-Nodes and Dequeue-Process-Generation:

Attribute-Transfer Rules:

1. ce := (outside-ce (innermost-recur (n> Dequeue-Process-Generation)))

Rule relating Dequeue-Process-Generation to Priority-Queue-Extract and Process-Event:

Attribute Conditions:

1. (input-corresponds? (p> Process-Event 4)
                       (p> Priority-Queue-Extract 1)
                       (feedback-ce (innermost-recur (n> Priority-Queue-Extract))))

2. (input-corresponds? (p> Process-Event 5)
                       (p> Process-Event 3)
                       (feedback-ce (innermost-recur (n> Priority-Queue-Extract))))

3. (co-occur (n> Priority-Queue-Extract) (n> Process-Event))

Attribute-Transfer Rules:

1. ce := (ce (n> Process-Event))

Legend:

E=Event
PQ=Priority-Queue
S=Sequence
A=Any
AN=Asynch-Node
M=Message
I=Integer

[Figure 4-22: Grammar rules for cliches used by Event-Driven Simulation cliche. Rule diagrams
omitted; their annotations follow.]

Rule whose right-hand side contains Co-Iterative-EDS-Finished:

Attribute-Transfer Rules:

1. ce := (outside-ce (innermost-recur (n> Co-Iterative-EDS-Finished)))

Rule whose right-hand side contains Priority-Queue-Empty?:

Attribute Conditions:

1. (exit-predicate (n> Priority-Queue-Empty?))

2. (ce: (ce-from (st-thru> 2 3))
        (success-ce (n> Priority-Queue-Empty?)))

3. (ce: (ce-from (output-edge (recursive-node (innermost-recur (n> Priority-Queue-Empty?)))
                              (edge-sink (st-thru> 2 3))))
        (feedback-ce (innermost-recur (n> Priority-Queue-Empty?))))

Attribute-Transfer Rules:

1. ce := (ce (n> Priority-Queue-Empty?))

2. success-ce := (success-ce (n> Priority-Queue-Empty?))

3. failure-ce := (failure-ce (n> Priority-Queue-Empty?))

Process-Event

The plan definition for the Process-Event cliche is shown in Figure 4-23. This cliche consists
of the four operations that are performed when an event is processed (as described at
the beginning of this section): looking up a destination ASYNCH-NODE, updating its Clock,
updating the ADDRESS-MAP, and handling the MESSAGE.

This plan contains a hierarchical data plan within it, which represents the EVENT data
cliche. It has two parts: an Object (a MESSAGE) and a Time (an integer). The Object part
is a MESSAGE data plan, which has four parts. The Destination-Address part (an integer) is
used to index into the ADDRESS-MAP sequence to look up the destination ASYNCH-NODE. This
ASYNCH-NODE is then given as input to the Update-Node-Time cliche, along with the Time
part of the EVENT. A new ASYNCH-NODE is returned and NEW-TERM is used to insert it into a
copy of the input ADDRESS-MAP, using the Destination-Address part of the MESSAGE as an
index. Finally, a Handle-Message operation is used to simulate the handling of the MESSAGE
in the Object part of EVENT. This operation takes the new ADDRESS-MAP and the EVENT-QUEUE
as inputs, as well as the MESSAGE, and returns an ADDRESS-MAP and EVENT-QUEUE.
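The four steps of Process-Event can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the simulator's code: the type and field names are ours, the MESSAGE is reduced to its Destination-Address plus a hypothetical payload field standing in for the other three parts, the ADDRESS-MAP is modeled as a tuple, and handle_message is passed in as a parameter because its implementation may be non-cliched.

```python
from collections import namedtuple

Event = namedtuple("Event", "time obj")                     # obj is the Object part (a Message)
Message = namedtuple("Message", "destination_address payload")
AsynchNode = namedtuple("AsynchNode", "memory time")


def process_event(event, address_map, event_queue, handle_message):
    """Return (new_address_map, new_event_queue) after processing one event."""
    message = event.obj
    dest = message.destination_address
    node = address_map[dest]                                     # look up the destination node
    new_node = node._replace(time=max(node.time, event.time))   # Update-Node-Time
    # Insert the new node into a copy of the map (the role NEW-TERM plays in the plan).
    new_map = address_map[:dest] + (new_node,) + address_map[dest + 1:]
    return handle_message(message, new_map, event_queue)


def no_op_handler(message, address_map, event_queue):
    """Hypothetical stand-in for Handle-Message that changes nothing."""
    return address_map, event_queue


nodes = (AsynchNode(memory={}, time=0), AsynchNode(memory={}, time=7))
ev = Event(time=5, obj=Message(destination_address=0, payload=None))
new_map, new_queue = process_event(ev, nodes, (), no_op_handler)
```

Because the map is copied rather than mutated, the input nodes tuple is unchanged after the call, matching the pure (side-effect-free) style the plan assumes.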

Figure 4-24 shows the rule that encodes the Process-Event cliche, plus two rules that
derive the non-terminals Lookup-Destination and Record-at-Destination. These two additional
rules are needed because we cannot directly encode the hierarchical data plan for
EVENT in the embedding relation of one grammar rule. Grammar rules can only represent one
level of aggregation at a time. (This is a limitation of the current implementation of GRASPR.
It does not appear to reflect an inherent difficulty with the graph parsing approach.) To get
around this limitation, we decompose the dataflow graph structure of the plan so that we
separate those parts that access parts of the MESSAGE from those that access the EVENT. We
then create rules taking the non-terminals Lookup-Destination and Record-at-Destination
to the sub-flow graphs representing those parts that access the parts of MESSAGE.

The rules for Lookup-Destination and Record-at-Destination contain embedding relations
in which a left-hand side port is mapped to a tuple containing some empty elements
(denoted by asterisks). This represents the fact that not all of the parts of the MESSAGE data
structure are used by the operations represented by nodes on the rule’s right-hand side.

Part  of  the  Process-Event  cliche  is  the  Handle-Message  operation.  We  have  grammar 
rules  that  encode  one  possible  cliched  implementation  of  this  operation.  (These  are  not 
shown  here,  since  they  are  more  of  the  same  type  we  have  seen  already.) 

However,  we  would  also  like  to  allow  Process-Event  (and  the  rest  of  the  Event-Driven 
Simulation  cliche)  to  be  recognized  in  simulators  in  which  the  Handle-Message  operation 
is  non-cliched.  That  is,  we  would  like  to  think  of  this  as  applying  a  non-cliched  function 
to  the  MESSAGE  which  simulates  the  handling  of  a  real  message  by  a  real  processing  node. 


[Figure 4-23: Plan definition for the Process-Event cliche. Labels: Event;
New-Event-Queue: Priority-Queue; New-Address-Map: Sequence.]

[Figure 4-25: Plan definition for the Update-Node-Time cliche. Label: Old: Asynch-Node.]

Unfortunately, it is difficult to do this within the graph parsing framework. It would require
the Handle-Message non-terminal in the rule for Process-Event to derive an arbitrary flow
graph. In general, it is difficult to express and match a cliche that is parameterized over
non-primitive, non-cliched functions. (This is the same problem we ran into in codifying the
generation cliche in Section 4.1.3. See Section 5.2.3 for more discussion of this problem.)

Update-Node-Time 

Update-Node-Time is a cliched operation that synchronizes an ASYNCH-NODE’s Clock to the
current “simulated time,” which is the time of the most recent EVENT pulled from the
EVENT-QUEUE. The operation takes an ASYNCH-NODE and the simulated time (an integer) and
returns a new ASYNCH-NODE whose Clock is either the simulated time or the time of the
input ASYNCH-NODE’s Clock, whichever is later. The plan definition of this operation is
shown in Figure 4-25. An ASYNCH-NODE has two parts: a Memory (an Associative Set)
and a Time (an Integer). This cliche takes an ASYNCH-NODE and an integer and creates a
new ASYNCH-NODE whose Time part is the maximum of the input integer and Time part of
the input ASYNCH-NODE. The Memory part of the output is the same as that of the input
ASYNCH-NODE. The rule that encodes this plan definition is shown in Figure 4-26.

Enqueuing  New  Events 

One of the actions of a processing node that is simulated as part of the simulation of message
handling is the creation and sending of new messages. One of the constraints on the
event-driven simulation algorithm is that whenever a message send is simulated, a new EVENT
must be created and added to the EVENT-QUEUE. (Similarly, in the synchronous simulation
algorithm, when the message handling simulation simulates the sending of a message, the
MESSAGE that represents it must be added to the global MESSAGE buffer.)

[Figure 4-26: Grammar rule encoding the Update-Node-Time plan. Mnemonic tuple element
names: <Memory,Time>.]

Unfortunately, this constraint is difficult to express in the grammar rule encoding and
to check in the simulator code. Partly this is because the node action simulation code is not
guaranteed to be cliched, so we have no context in which to express the constraint. Another
reason is that the part of the simulation code that performs the activity of enqueuing new
EVENTs (or MESSAGEs) is typically given as input to the simulator. So, it is not available for
analysis. (As discussed in Section 2.2, PiSim takes as input a set of functions each of which
specifies how to simulate the actions of a node in executing some machine operation. Some
of these functions create new EVENTs and enqueue them.) These problems are discussed
further in Section 5.2.4.

Although this constraint is difficult to express and check within the current graph parsing
framework, it is not a hard constraint for a person to check. It might be easier to just ask
the user whether the constraint holds. This question can be asked with reference to the
particular locations in the program, corresponding to locations in the input graph where
the Handle-Message operation is likely to occur. (This can be based on where the rest of
Process-Event has been found.)

4.2  Architectural  Details 

This  section  fills  in  details  of  how  flow  graph  parsing  is  used  to  solve  the  partial  program 
recognition  problem.  Section  4.2.1  describes  how  textual  source  code  is  translated  into  an 
attributed  flow  graph.  Section  4.2.2  discusses  an  additional  monitor  that  tailors  the  parser 
to  deal  with  a  type  of  graph  variation  that  is  specific  to  the  program  recognition  application. 
Section  4.2.3  describes  how  the  Paraphraser  presents  the  parser’s  results. 

4.2.1 Translating Programs to Flow Graphs

A program is translated from source code to attributed flow graph in two stages. First, a
plan representation of the source code is created. Then, an attributed flow graph is computed
from this intermediate representation. Creating the intermediate plan representation
of the code facilitates the computation of attributes for the flow graph.

Source  Code  to  Plan  Diagram 

The plan creation stage is itself composed of two stages: macro-expansion, followed by
symbolic evaluation. The macro-expander translates the program into a simpler language
of primitive forms. It does this by expanding any macro calls in the source program and
by using a set of additional macro-like definitions to expand each complex construct in the
source into a set of simpler forms. In particular, all of the control constructs are converted
to simple conditional and unconditional branches. All of the data constructs are converted
into bindings of or assignments to simple atomic variables.

The  macro-expanded  code  is  then  symbolically  evaluated.  The  evaluator  follows  all 
possible  control  paths  of  the  program,  starting  with  some  topmost  (“main”)  function  of 
the  program.  It  converts  operations  to  boxes  and  places  arcs  between  them,  corresponding 
to  data  and  control  flow.  Whenever  a  branch  in  control  flow  occurs,  a  test  box  is  added. 
Similarly,  when  control  flow  comes  back  together,  a  join  box  is  placed  in  the  graph  and  all 
data  representing  the  same  variable  are  merged  together. 

Boxes  for  user-defined  functions  are  replaced  with  the  plans  for  their  definitions,  except 
for  those  within  recursive  functions.  This  flattening  allows  variability  in  the  way  programs 
to  be  analyzed  are  broken  down  into  subroutines.  The  user  may  also  advise  that  certain  calls 
not  be  expanded  for  efficiency  reasons.  (Any  unexpanded  function  whose  name  happens  to 
be  a  non-terminal  in  the  grammar  is  systematically  renamed,  unless  the  user  specifies  that 
the  function  is  an  instance  of  the  cliche  named  by  the  non-terminal.) 

The  symbolic  evaluator  inserts  explicit  selector  and  constructor  boxes  into  the  plan 
diagram  for  each  user-defined  accessor  and  constructor. 

The  plan  representation  may  be  used  as  the  target  representation  for  many  different 
languages.  The  flow  analyzer  used  by  GRASPR  translates  Lisp  programs  into  plans.  Similar 
analyzers  were  previously  written  not  only  for  Lisp  ([114,  137,  139]),  but  also  for  subsets  of 
Cobol  [42],  Fortran  [137],  and  Ada  [139],  but  are  not  used  in  this  system. 

Plan  Diagram  to  Attributed  Flow  Graph 

Once  the  plan  representation  for  the  program  is  created,  it  is  encoded  as  an  attributed  flow 
graph.  The  dataflow  structure  of  the  plan  is  retained  in  the  flow  graph.  Control  environment 
attributes  are  computed  from  the  control  flow  structure.  Joins  are  replaced  with  edges  that 
fan  in,  annotated  with  ce-from  attributes.  Explicit  accessors  and  constructors  are  also 
replaced  by  attributed  edges.  Each  accessor  and  composition  of  accessors  is  treated  as  a 
Spread  node  and  each  constructor  as  a  Make  node.  These  Spreads  and  Makes  are  removed 
using  the  aggregation-removal  transformations  described  in  Section  3.4.2.  The  residual 
Spreads and Makes are then replaced with attributed fan-out and fan-in edges.

(defun Insert-Queue (Entry)
  (cond ((Empty-or-Low-Priority-Head? Entry *Event-Queue*)
         (push Entry *Event-Queue*))
        (t (let ((Next (cdr *Event-Queue*))
                 (Previous *Event-Queue*))
             ;; find spot to splice Entry in:
             (loop do
               (when (Empty-or-Low-Priority-Head? Entry Next)
                 (return))
               (setq Previous Next)
               (setq Next (cdr Next)))
             ;; perform the splice:
             (rplacd Previous (cons Entry Next))))))

Figure 4-27: Code that side effects the mutable data structure *Event-Queue*.

4.2.2  Additional  Monitor  to  Handle  Recursion  Unfolding 

One  of  the  types  of  variations  that  can  arise  in  recursive  programs  is  that  a  loop  in  one 
can  be  unrolled  in  another,  or  more  generally,  a  recursion  can  be  unfolded.  This  variation 
arises  in  our  program  examples  when  we  convert  the  impure  programs  to  pure  ones  (having 
no  side  effects  to  mutable  objects).  In  this  situation,  special  cases  of  a  recursion  sometimes 
translate  to  the  general  recursive  case.  This  means  that  the  general  case  is  redundantly 
performed  once,  before  the  recursion  is  called. 

For example, the code in Figure 4-27 destructively inserts Entry into the ordered associative
list *Event-Queue*. It first tests for the special case in which Entry belongs on the
front of the list (either because the list is empty or its first element has a lower priority
than Entry). In this case, it destructively places Entry on the front of *Event-Queue* using
push. Otherwise, Insert-Queue performs the general case, in which *Event-Queue* is searched
for the place to insert Entry and then Entry is spliced in at that place.

When this program is translated into its non-destructive version, shown in Figure 4-28,
the special case head insertion becomes the same as the normal splice-in operation.
Insert-Queue-Pure can be rewritten as Folded-Insert-Queue, shown in Figure 4-29, in which
the recursion is folded back up.
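The equivalence between the unfolded and folded forms can be checked with a small Python transcription of the two functional versions. This is an illustrative sketch only: the queue is modeled as a tuple of numeric priorities, and the convention that a smaller number means a higher priority is an assumption of the sketch, as are the Python function names.

```python
def head_is_empty_or_lower(entry, queue):
    """True if queue is empty or its head has lower priority than entry.

    Assumption: a smaller number means a higher priority.
    """
    return not queue or queue[0] > entry


def splice_in(entry, queue):
    """General recursive case: walk down the queue to the insertion point."""
    if head_is_empty_or_lower(entry, queue):
        return (entry,) + queue
    return (queue[0],) + splice_in(entry, queue[1:])


def insert_unfolded(entry, queue):
    """Mirrors Insert-Queue-Pure: the head test is done once before the recursion."""
    if head_is_empty_or_lower(entry, queue):
        return (entry,) + queue
    return (queue[0],) + splice_in(entry, queue[1:])


def insert_folded(entry, queue):
    """Mirrors Folded-Insert-Queue: the recursion alone covers the head case."""
    return splice_in(entry, queue)
```

The body of insert_unfolded is a copy of one step of splice_in, which is exactly the redundancy that the additional monitor is meant to see through.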

To  deal  with  this  type  of  variation,  we  provided  an  additional  monitor  to  the  flow 
graph  parser,  which  looks  for  an  opportunity  to  view  a  program  that  contains  an  unfolded 
recursion  as  one  in  which  the  recursion  is  folded  back  up.  By  generating  this  alternative 
view,  the  parser  is  then  able  to  recognize  the  program  as  if  it  did  not  have  an  unfolded 
recursion.  This  augmentation  of  the  parser  with  a  new  monitor  tailors  it  to  solve  a  problem 
specific  to  its  application  to  the  program  recognition  problem.  This  section  describes  the 
new monitor and how the new view is generated.

(defun Insert-Queue-Pure (Entry)
  (setq *Event-Queue*
        (cond ((Empty-or-Low-Priority-Head? Entry *Event-Queue*)
               (cons Entry *Event-Queue*))
              (t (cons (car *Event-Queue*)
                       (Splice-In Entry (cdr *Event-Queue*)))))))

(defun Splice-In (Entry Next)
  (cond ((Empty-or-Low-Priority-Head? Entry Next)
         (cons Entry Next))
        (t (cons (car Next)
                 (Splice-In Entry (cdr Next))))))

Figure 4-28: Functional version of Insert-Queue.


(defun Folded-Insert-Queue (Entry)
  (setq *Event-Queue* (Splice-In Entry *Event-Queue*)))

(defun Splice-In (Entry Next)
  (cond ((Empty-or-Low-Priority-Head? Entry Next)
         (cons Entry Next))
        (t (cons (car Next)
                 (Splice-In Entry (cdr Next))))))

Figure 4-29: Version of Insert-Queue-Pure in which recursion is folded up.

Recursion information: [recur-ce: ce5, feedback-ce: ce4, outside-ce: ce3]

Figure 4-31: Partial ordering relationships between the control environments of
Insert-Queue-Pure’s flow graph.

Figure 4-30 shows the flow graph representation of Insert-Queue-Pure. A dashed box
is drawn around the boundary of the sub-flow graph representing its recursion. GRASPR
generates an alternative view of this flow graph in which the recursion boundary is expanded
outward and the redundant computation is collapsed together.

The way it works is based on the observation that when GRASPR tries to recognize an
unfolded program, most of the constraints (structural as well as attribute conditions) are
satisfied. The only ones that are not are those that refer to the program’s recursion
information (e.g., those constraining two ports to input-correspond or those referring to the
feedback-ce of the recursion).

So, constraints are placed into two classes: regular and recursion. When an item fails
only its recursion constraints, it is suspended, which means it is placed in a holding data
structure used by the new monitor. The monitor watches for another complete item, called
a partner, to be added to the chart that can collapse with the suspended item. An item
I_s can collapse with another item I_p if they are recognizing the same non-terminal type
in control environments that are analogous. (This relation is defined below.) Collapsing
two items means creating a new item which is the same as the suspended item, but whose
constraints are checked in the context of the partner item.

Intuitively, two control environments are analogous if they contain operations that
would collapse together if the recursion were folded back up. For example, Figure 4-31
shows the partial ordering of the control environments and recursion information for
Insert-Queue-Pure. The analogous pairs of control environments are (ce1,ce5), (ce2,ce3),
and (ce3,ce4).

The analogy relations are symmetric, but neither reflexive nor transitive. Analogy relations
between control environments are computed from the surface plan during its translation to
an attributed flow graph.
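The partner test used by the monitor can be sketched as follows. This is a simplified illustration, not GRASPR's implementation: the Item type carries only a non-terminal name and a single control environment, the function names are ours, and the non-terminal name used in the usage lines is a hypothetical example. The analogy pairs are the ones listed above for Insert-Queue-Pure.

```python
from collections import namedtuple

# Simplified chart item: the non-terminal recognized and its control environment.
Item = namedtuple("Item", "nonterminal ce")


def make_analogy(pairs):
    """Store each unordered pair once, so lookups are symmetric by construction."""
    return {frozenset(pair) for pair in pairs}


def analogous(analogy, ce_a, ce_b):
    """Symmetric lookup; deliberately neither reflexive nor transitively closed."""
    return ce_a != ce_b and frozenset((ce_a, ce_b)) in analogy


def find_partners(suspended, completed, analogy):
    """Return (suspended item, partner) pairs that are candidates for collapsing."""
    return [(s, p)
            for s in suspended
            for p in completed
            if s.nonterminal == p.nonterminal and analogous(analogy, s.ce, p.ce)]


analogy = make_analogy([("ce1", "ce5"), ("ce2", "ce3"), ("ce3", "ce4")])
suspended = [Item("Splice-In", "ce1")]
completed = [Item("Splice-In", "ce5"), Item("Other", "ce5")]
candidates = find_partners(suspended, completed, analogy)
```

Storing unordered pairs gives symmetry for free, while the absence of any closure step keeps ce2 and ce4 from being treated as analogous merely because each is paired with ce3.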

Once a suspended item is collapsed with a partner, the new “collapsed” item is added
to the agenda. Its constraints are satisfied because they refer to attributes of the sub-flow
graph matched by the partner item. The collapsed item’s left-hand side control environment
attributes are computed by applying the rule’s attribute-transfer rules in the context of the
partner item and then translating them to the analogous control environment.
(Attribute-transfer rules that use recursion information in their computation are handled
specially. In particular, if the rule computes the outside-ce of the innermost recursion
containing some node, the control environment analogous to the recur-ce of this recursion is
transferred.)

When a collapsed item is used to extend another item, it imposes new edge connection
constraints on the items for adjacent non-terminals. Suppose a collapsed item I_a, having
partner I_p, extends another item to create an item I_c, where I_a is representing the
derivation of non-terminal A in the right-hand side of I_c’s rule. If an item I_b for a
non-terminal adjacent to A has a partner of its own, then I_p and that partner should be
connected together in the same way as I_a and I_b.

The suspend-collapse-resume mechanism for recursion folding can be generalized to a
“try-harder” technique for handling more types of near-misses besides those that fail
recursion constraints. More classes of constraints can be identified. When an item fails certain
classes of constraints, something might be done to cause them to be satisfied (e.g., changing
an attribute) or weakened (e.g., changing a co-occurrence condition between two nodes to a
C condition). Then the item can be resumed simply by putting it back on the agenda. The
changes can be reported as conditions or assumptions under which some cliche is recognized
in the program.

4.2.3  Paraphraser 

The output of the recognition process is a forest of design trees, representing the cliches
found and how they relate to each other. One way to use this output is to automatically
generate documentation for the program recognized. Paraphraser is a tool which takes the
forest of design trees produced by GRASPR and generates textual documentation for each.
Each cliche in our library has an associated schematized textual explanation fragment whose
slots may be filled in with identifiers in the program. (This is based on earlier work by
Cyphers [24] and Frank [45].)

Paraphraser starts at the root of a design tree and traverses it depth first, generating a
hierarchical description based on the explanation fragments associated with each cliche
encountered. It reports the relationships between each cliche in the tree and those immediately
below it (e.g., Queue-Insert is implemented by FIFO-Enqueue, Sum temporally abstracts
Summing). If an implementation relationship exists between two cliches and a data abstraction
is uncovered, this is reported as well (e.g., The Queue is implemented as a FIFO.).
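The depth-first traversal can be illustrated with a short Python sketch. It is not the Paraphraser itself: the DesignNode class, its field names, and the one-line-per-relationship output format are assumptions made for the illustration, and the explanation fragments are reduced to a relation phrase stored on each child.

```python
from dataclasses import dataclass, field


@dataclass
class DesignNode:
    cliche: str
    relation: str = ""                       # how the parent relates to this node
    children: list = field(default_factory=list)


def paraphrase(node, parent=None, depth=0):
    """Walk the design tree depth first, emitting one indented line per relationship."""
    lines = []
    if parent is None:
        lines.append(node.cliche)
    else:
        lines.append("  " * depth + f"{parent.cliche} {node.relation} {node.cliche}.")
    for child in node.children:
        lines.extend(paraphrase(child, node, depth + 1))
    return lines


tree = DesignNode("Queue-Insert",
                  children=[DesignNode("FIFO-Enqueue", "is implemented by")])
report = paraphrase(tree)
```

The indentation depth follows the tree depth, so the generated text keeps the hierarchical structure of the design tree.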

Variable names are included in the text to indicate the location of the cliche. Also, some
slots in the explanation fragments are filled in with primitive operation types, such as <
in “An element’s priority P is higher than another’s Q, if P < Q.” This often happens
when generalized node types are used. In this case the generalized node type matched
any primitive predicate that was a comparator. Paraphraser is also able to compute some
mappings from user-defined data structure part names to the part names of aggregate data
cliches that are recognized. This is described below.

The user can select which design trees to document. By default, Paraphraser documents
all of them, starting with those whose roots are at the highest level in the library. Currently,
all cliches recognized are reported, including those that represent multiple views of some part
of the program. No single best interpretation is preferred. We view the job of selecting views
of the program and focusing on particular results of the recognition as the responsibility of
a higher-level control mechanism which has information about how the results will be used
and which view of the program is most useful.

Mapping  Cliched  Aggregate  Names  to  User-Defined  Data  Structure  Names 

Paraphraser  heuristically  computes  mappings  from  the  names  of  user-defined  data  structures 
and  their  parts  to  those  of  aggregate  data  cliches  that  are  recognized  in  the  program. 
However,  the  current  implementation  is  not  robust.  The  mappings  are  often  incomplete 
and  ambiguous.  (This  is  an  area  requiring  further  work.) 

The names of user-defined data structures and their parts are associated with edges in
the program’s flow graph in the form of accessor and constructor attribute values. Each
accessor attribute has a value that describes how the data it carries to the edge’s sink is
a part of the data structure at the edge’s source. Because data structure accesses and
constructions can be composed, the values of these attributes are sets of ordered lists of
tuples of the form <structure-type part-name>, where the order corresponds to the order
of composition of the accesses or constructions. They are sets of ordered lists because an
edge can represent dataflow from more than one output of a selector to more than one
input of a constructor. For example, in the flow graph representing (1+ (queue-length
(node-queue (aref *nodes* i)))), the edge from the output of “aref” to the input of “1+”
has an accessor attribute of value (<Node Queue> <Queue Length>).

Each ordered list can be seen as a “path” that describes how the source data structure
is destructured to result in the piece of data at the sink. The path may be of arbitrary
length, since the piece of data may be nested deeply within several data structures.
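An accessor path of this kind can be modeled as an ordered tuple of (structure-type, part-name) steps. The following Python sketch is illustrative only: compose_access and format_path are hypothetical helper names, and a single path is shown rather than the set of paths that an attribute value may hold.

```python
def compose_access(path, structure_type, part_name):
    """Extend an accessor path with one more <structure-type part-name> step."""
    return path + ((structure_type, part_name),)


def format_path(path):
    """Render a path in the notation used in the text."""
    return "(" + " ".join(f"<{s} {p}>" for s, p in path) + ")"


# (queue-length (node-queue (aref *nodes* i))): node-queue is applied first,
# then queue-length, so the steps appear in order of composition.
path = compose_access((), "Node", "Queue")
path = compose_access(path, "Queue", "Length")
rendered = format_path(path)
```

Since each access only appends a step, a path grows with the nesting depth of the data, which is why its length is unbounded.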

Similarly,  each  edge  holds  a  constructor  attribute  that  describes  how  the  data  it  carries 
becomes  part  of  some  data  structure.  The  value  of  the  accessor  and  constructor  attributes 
is  undefined  if  the  edge  is  not  carrying  data  involved  in  some  aggregation. 

The edge attributes are used to create the mappings between names in cliched structures
and in user-defined ones. When an operation on a cliched aggregate data structure is
recognized, the parser has matched each part of the structure to an edge (or recursively
to a tuple of sub-part matchings, if the part itself is an aggregation). This creates a tree
representing the cliched aggregate data structure’s organization, with the leaves matching
edges in the flow graph representing the program. Those accessor and constructor values
that are defined are combined to form trees that represent the portions of the user-defined
data structure organization. (There may be more than one if the recognition involves parts
from more than one user-defined data structure.) The fringes of these trees are matched
to the fringes of the cliched organization tree. This generates mappings between the part
names of the lowest level structures involved. Mappings between higher level nodes of the
trees are heuristically computed. For example, if all parts of a cliched data structure map
to all parts of a user-defined structure, then the two data structures map to each other.

FIFO Dequeue is implemented as a Circular
Sequence Extract. The FIFO is implemented as a CIS.

Circular Indexed Sequence Extract extracts the
first element from the Circular Indexed Sequence.

The First part: (<NODE QUEUE> <QUEUE HEAD>)
The Fill-Count part: (<NODE QUEUE> <QUEUE LENGTH>)
The Size part: (<NODE QUEUE> <QUEUE DATA-SIZE>)
The Base part: (<NODE QUEUE> <QUEUE DATA>)

Figure 4-32: Documentation containing a cliched-to-user-defined name mapping.

Equality  constraints  are  imposed  locally  by  the  rules  for  cliche  data  structure  operations. 
These  require  that  each  cliched  part  name  map  consistently  to  the  same  programmer-defined 
part  name  (or  set  of  names,  if  there  is  ambiguity  in  which  attributes  match). 

Figure 4-32 gives an example of a mapping computed from the recognition of a CIS-Extract.
The mapping is included in the documentation of this cliche. This mapping is
incomplete in that the “Last” part of the Circular Indexed Sequence is not mapped to
anything. This is because in the program, the optional unconstrained straight-through representing
the “Last” part was not matched. Because not all of the parts of the cliched
data structure are mapped, the mapping cannot be refined. If “Last” were mapped to
(<NODE QUEUE> <QUEUE TAIL>), then since the user-defined data structure QUEUE has no more
parts, QUEUE can be mapped to CIS and each of the part mappings can be reduced from
(<NODE QUEUE> <QUEUE x>) to (<QUEUE x>). If “Last” were mapped to (<NODE MAX-INDEX>),
and NODE had only parts “Queue” and “Max-Index,” then NODE would be mapped to CIS
and the mappings would remain the same (i.e., not be reduced).
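
The refinement cases just described can be sketched in a few lines of Python. This is only an illustrative reading of the heuristic, not GRASPR's implementation: the function name `refine_mapping`, the encoding of an accessor path as a tuple of (structure, part) steps, and the extra NODE part "OBJECTS" are all assumptions made for the example.

```python
def refine_mapping(mapping, cliche_parts, struct_parts):
    """Try to lift a part-level mapping to a structure-level one.

    mapping:      cliched part name -> accessor path, a tuple of
                  (structure, part) steps from outermost to innermost
    cliche_parts: every part name of the cliched data structure
    struct_parts: user-defined structure name -> list of its part names
    Returns (structure mapped to the cliche, or None; refined mapping).
    """
    # An incomplete mapping (some cliched part unmatched) cannot be refined.
    if set(mapping) != set(cliche_parts):
        return None, dict(mapping)
    paths = dict(mapping)
    while True:
        owners = {path[0][0] for path in paths.values()}
        if len(owners) != 1:
            return None, dict(mapping)
        owner = next(iter(owners))
        touched = {path[0][1] for path in paths.values()}
        if touched == set(struct_parts[owner]):
            # Every part of `owner` is accounted for: map it to the cliche.
            return owner, paths
        # Otherwise descend only if all paths pass through one shared part.
        if len(touched) == 1 and all(len(p) > 1 for p in paths.values()):
            paths = {name: p[1:] for name, p in paths.items()}
        else:
            return None, dict(mapping)


def via_queue(part):
    """Accessor path that reaches `part` through the Queue part of a NODE."""
    return (("NODE", "QUEUE"), ("QUEUE", part))


cis_parts = ["First", "Fill-Count", "Size", "Base", "Last"]
structs = {
    "NODE": ["QUEUE", "OBJECTS"],  # "OBJECTS" is a hypothetical extra part
    "QUEUE": ["HEAD", "LENGTH", "DATA-SIZE", "DATA", "TAIL"],
}
figure_4_32 = {
    "First": via_queue("HEAD"),
    "Fill-Count": via_queue("LENGTH"),
    "Size": via_queue("DATA-SIZE"),
    "Base": via_queue("DATA"),
}
# With "Last" mapped to the TAIL part, QUEUE maps to CIS and paths shorten.
complete = dict(figure_4_32, Last=via_queue("TAIL"))
owner, refined = refine_mapping(complete, cis_parts, structs)
```

Under these assumptions, the incomplete mapping of Figure 4-32 is returned unchanged, while the completed mapping yields QUEUE as the structure corresponding to CIS, with each path reduced to its final step.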

Ambiguity  arises  when  an  accessor  or  constructor  attribute  has  a  set  of  values  that  are 
mapped  to  some  cliched  part.  It  also  occurs  when  some  part  of  a  program  is  recognized  as 
more  than  one  data  structure  operation. 

In addition to these local refinements to the mappings, global constraint propagation
should  be  used  to  refine  them  further.  Future  research  will  focus  on  this.  The  results 
can  be  valuable  not  only  in  presenting  the  results  of  recognition,  but  also  as  a  source  of 
expectations  which  can  be  used  to  further  guide  and  refine  data  structure  recognition.  (See 
Section  7.2.3.) 
Chapter 5


Capabilities  and  Limitations 


There  are  two  parts  of  our  analysis  of  the  graph  parsing  approach.  One  is  identifying  its 
practical  capabilities  and  limitations  in  the  context  of  real-world  programs.  The  other  is 
studying  the  computational  cost  of  this  approach.  This  chapter  discusses  the  first  aspect, 
while  Chapter  6  deals  with  the  second.  In  this  chapter,  we  consider  both  the  robustness  of 
our  recognition  technique  under  common  program  variations  and  the  expressiveness  of  our 
graph  grammar  formalism  for  encoding  programming  cliches. 

5.1  Variations  Tolerated 

Automated recognition of cliches must be robust under a wide range of variations in programs.
We employ three basic strategies for achieving this goal. First, we use an abstract
representation  for  programs  and  cliches.  This  representation  suppresses  many  details  which 
can  vary  across  programs  but  which  do  not  constitute  significant  differences  between  the 
cliches  that  exist  in  the  programs.  Our  representation  exposes  the  algorithmic  and  dataflow 
structure  of  the  program,  while  abstracting  away  syntactic  and  organizational  differences. 

When  some  unimportant  details  are  not  suppressed  by  our  representation  (i.e.,  when 
two  or  more  program  variations  are  not  represented  the  same),  we  try  a  second  strategy.  We 
provide  ways  for  GRASPR  to  generate  cheap  alternative  views  of  the  program  representation. 
These  views  are  created  by  additional  chart  monitors  during  parsing,  such  as  those  that 
deal  with  redundancy. 

It  is  possible  to  also  handle  this  in  a  pre-processing  stage  (rather  than  during  parsing) 
by  choosing  one  variation  as  canonical  and  applying  cheap  transformations  to  canonicalize 
other  variations  with  respect  to  this  one.  However,  sometimes  seeing  the  transformation 
opportunity  requires  performing  recognition.  For  example,  zipping  up  two  instances  of  an 
abstract  operation  that  each  involve  a  different  implementation  requires  recognition  to  view 
the  redundant  code  as  performing  the  same  operation. 

When a cliche exists in two programs that are not represented the same in our representation
or cannot be cheaply viewed as the same, we fall back on our third strategy. This is
to enumerate the variations in our library. For example, we use this tactic to deal with implementation
variation. However, when enumerating variations, we rely on our knowledge
of  the  empirical  frequency  of  occurrence  of  the  variations.  We  do  not  collect  every  variation 
of  a  cliche  we  can  think  of,  only  those  that  are  common.  The  hierarchical  structure  of  the 
cliche  library  helps  to  make  the  enumeration  concise. 

These  three  tactics  allow  us  to  automate  program  recognition  so  that  it  is  robust  under 
the  common  program  variations  described  in  Section  2.3.1.  Our  abstract  representation 
eliminates  syntactic  and  organizational  variation,  as  well  as  variation  due  to  delocalization, 
unfamiliar  code,  and  some  function-sharing  optimizations.  This  is  discussed  in  more  detail 
in Sections 5.1.1-5.1.5. By generating alternative views cheaply, GRASPR is able to deal
with  variation  due  to  redundancy,  as  is  discussed  in  Section  5.1.6.  Because  implementation 
variations  are  concisely  enumerated  in  the  cliche  library,  GRASPR  is  able  to  recognize  the 
same  abstract  cliched  operation  in  programs  that  contain  different  implementations  of  the 
operation.  This  is  discussed  in  Section  5.1.7. 

5.1.1  Syntactic  Variation 

In  Section  2.3.2,  we  showed  two  programs  (in  Figures  2-10  and  2-11)  which  GRASPR  recognized 
as  containing  the  same  cliches,  even  though  they  differ  syntactically.  This  is  due  to  the  fact 
that  both  programs  are  represented  as  the  same  flow  graph,  shown  in  Figure  5-1. 

The  figure  does  not  show  the  complete  flow  graph.  Some  function  calls  are  depicted  as 
nodes  for  brevity.  However,  they  are  sub-flow  graphs  in  the  actual  representation.  These 
nodes  are  drawn  with  dotted  lines  to  show  that  they  hide  some  detail.  Also,  dashed  lines 
are  drawn  around  the  sub-flow  graph  representing  the  recursive  function  Execute-Events. 
(Small  filled-in  circles  indicate  fan-in  and  fan-out.  They  are  not  special  vertices  in  the  flow 
graph.  They  are  used  to  distinguish  edges  that  share  sinks  or  sources  from  those  that  merely 
cross  each  other.) 

Accessor  and  constructor  attributes  on  edges  are  not  shown  in  the  figure  because  they 
differ  for  the  two  programs.  Instead,  the  edges  for  which  these  attributes  have  defined  values 
(i.e., not undefined) are labeled <e1>, ..., <e7>. Figure 5-2 lists the actual attribute values
for  these  edges  for  the  programs  of  Figures  2-10,  2-11,  as  well  as  Figure  2-12. 

The  flow  graph  representation  abstracts  away  syntactic  differences  between  programs. 
Attributed dataflow edges explicitly represent the net effect of binding and control constructs,
abstracting away such details as which constructs are used, which variables are
bound, and whether data is passed through nested expressions or via bindings to intermediate
variables.

Information  concerning  the  names  of  user-defined  data  structures  and  their  parts  is 
relegated  to  edge  attributes,  so  that  differences  due  to  explicit  accessor  and  constructor 
functions  do  not  arise  in  the  structure  of  the  graph. 

Also,  the  representation  captures  only  “essential”  orderings  of  operations,  which  are 
Figure 5-1: Flow graph representing the code in Figures 2-10, 2-11, and 2-12.

(a) Figure 2-10

<e1>: Accessor:    undefined
      Constructor: {(<Message Arguments> <Event Object>)}
<e2>: Accessor:    undefined
      Constructor: {(<Message Length> <Event Object>)}
<e3>: Accessor:    undefined
      Constructor: {(<Message Type> <Event Object>)}
<e4>: Accessor:    {(<Node Time>)}
      Constructor: {(<Event Time>)}
<e5>: Accessor:    undefined
      Constructor: {(<Message Destination> <Event Object>)}
<e6>: Accessor:    {(<Handler Arity>)}
      Constructor: undefined
<e7>: Accessor:    {(<Handler Number-of-Locals>)}
      Constructor: undefined

(b) Figure 2-11

<e1>: Accessor:    undefined
      Constructor: {(<Msg Args> <Event Object>)}
<e2>: Accessor:    undefined
      Constructor: {(<Msg Storage-Length> <Event Object>)}
<e3>: Accessor:    undefined
      Constructor: {(<Msg Type> <Event Object>)}
<e4>: Accessor:    {(<Node Time>)}
      Constructor: {(<Event Time>)}
<e5>: Accessor:    undefined
      Constructor: {(<Msg Dest-Addr> <Event Object>)}
<e6>: Accessor:    {(<Handler Arity>)}
      Constructor: undefined
<e7>: Accessor:    {(<Handler Number-of-Locals>)}
      Constructor: undefined

(c) Figure 2-12

<e1>: Accessor:    undefined
      Constructor: {(<Handler-Data Arguments> <Msg Data>)}
<e2>: Accessor:    undefined
      Constructor: {(<Handler-Data Length> <Msg Data>)}
<e3>: Accessor:    undefined
      Constructor: {(<Handler-Data Type> <Msg Data>)}
<e4>: Accessor:    {(<Node Time>)}
      Constructor: {(<Msg Arrived-Time>)}
<e5>: Accessor:    undefined
      Constructor: {(<Msg Destination>)}
<e6>: Accessor:    {(<Handler Arity>)}
      Constructor: undefined
<e7>: Accessor:    {(<Handler Number-of-Locals>)}
      Constructor: undefined

Figure 5-2: Attribute values for accessor and constructor attributes annotating the flow
graphs representing the programs in Figures 2-10 (column a), 2-11 (column b), and 2-12
(column c).


those  determined  by  dataflow  dependencies.  Dataflow  graphs  make  dataflow  dependencies 
explicit, imposing a partial ordering on the program’s operations (rather than the linear, total
ordering imposed by text). So programs which vary only in their ordering of independent
computations  will  have  the  same  flow  graph  representation. 
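
As a rough illustration of this point, the following Python sketch builds a name-free dataflow view of straight-line code and compares two textual variants. It is a toy model under stated assumptions, not GRASPR's translator: the statement encoding, the function name `flow_graph`, and the operation names are invented for the example, and control flow and edge attributes are ignored.

```python
def flow_graph(statements, inputs):
    """Build a name-free dataflow view of straight-line code.

    statements: list of (target_variable, operation, argument_variables)
    inputs:     variables supplied from outside the fragment
    Each node is (operation, producers of its arguments), so variable
    names and the textual order of independent statements both vanish.
    """
    producer = {name: ("input", name) for name in inputs}
    nodes = []
    for target, operation, arguments in statements:
        node = (operation, tuple(producer[arg] for arg in arguments))
        producer[target] = node  # later uses of `target` link to this node
        nodes.append(node)
    # Sort so the result does not depend on statement order.
    return sorted(nodes, key=repr)


version_1 = [
    ("total", "sum", ["xs"]),
    ("count", "length", ["xs"]),
    ("mean", "divide", ["total", "count"]),
]
version_2 = [  # independent steps swapped, intermediate variables renamed
    ("n", "length", ["xs"]),
    ("s", "sum", ["xs"]),
    ("avg", "divide", ["s", "n"]),
]
same = flow_graph(version_1, ["xs"]) == flow_graph(version_2, ["xs"])
```

Here `same` is true because the two versions differ only in variable names and in the order of the two independent computations; swapping the arguments of `divide` would change a dataflow dependency and produce a different graph.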

The  attributed  flow  graph  representation  also  captures  constraints  on  data  and  control 
flow,  independent  of  the  language  in  which  they  are  expressed.  This  means  the  same  library 
of  cliches  can  be  used  to  recognize  cliches  regardless  of  the  language  in  which  the  program 
containing  them  is  written.  If  the  data  and  control  flow  of  a  program  can  be  statically 
determined,  then  the  program  can  be  represented  as  an  attributed  flow  graph.  This  is 
true  for  most  imperative,  sequential  programs  written  in  conventional  languages,  such  as 
Fortran,  Cobol,  Lisp,  and  Ada. 

Some examples of programs for which this is not true are those that contain nondeterministic
or concurrent language features. Also, programs that take other programs as input
cannot  be  fully  modeled  by  our  dataflow  graph  representation  because  part  of  their  data 
and  control  flow  information  is  hidden  in  their  input.  (This  is  discussed  further  in  Section 
5.2.) 

The abstraction properties of the flow graph representation enable cliches to be recognized
in programs without having to anticipate (and enumerate) all possible syntactic
variations of each cliche and without relying on source-to-source transformations to canonicalize
the code.

5.1.2  Organizational  Variation 

The  flow  graph  representation  is  also  the  key  to  dealing  with  variation  in  how  programs 
are  decomposed  into  subroutines  and  how  aggregate  data  structures  are  organized.  In 
this  representation,  the  subroutine  structure  is  flattened.  Each  call  to  a  subroutine  is 
represented  by  the  flow  graph  of  the  subroutine’s  body.  In  essence,  the  program  is  seen 
as completely open-coded. The key benefit of this is that instances of cliches which cross
subroutine boundaries are recognized as easily as those that are within a boundary. The
hierarchical organization of cliches built upon other cliches need not be reflected in the
program’s decomposition for the cliches to be recognized.

Of course, flattening all subroutine calls is not always advantageous. When a subroutine
is used in several places throughout the code and contains cliches entirely within its
boundaries, flattening it unnecessarily creates a large input flow graph and causes GRASPR
to repeat work. For example, utility subroutines for basic data structures often contain
general-purpose cliches entirely within their boundaries and they are usually called by several
higher-level functions. In this case, the subroutines should be recognized independently.
The results of recognition should then be duplicated and used wherever the subroutine was
called. For example, if a subroutine is recognized as a cliche, calls to it in the program should
be represented as an already-reduced non-terminal, which can be used in the recognition of
higher level cliches. This involves simply adding complete items to the chart, representing
already-reduced non-terminals.

Besides eliminating variation due to subroutine decomposition, GRASPR also deals with
variation in data structure organization. It does this by representing accessors and constructors
as attributed edges, rather than as explicit nodes in the flow graph, as are other
operations in the program. If the accessors and constructors were represented explicitly
as nodes, then the representation would fail to eliminate variation between programs that
aggregate the same data, but use different orderings of parts or different nesting of aggregations.
(The problems with explicit representation of accessors and constructors as Spread
and Make nodes were discussed in more detail in Section 3.4.2.)

The  flow  graph  formalism  was  specifically  designed  to  allow  aggregation-equivalent  flow 
graphs  to  be  recognized.  Programs  are  represented  as  minimally-aggregated  flow  graphs, 
with  any  internal  residual  Spreads  and  Makes  replaced  with  attributed  fan-out  and  fan-in 
edges. Cliches involving aggregate data structures are expressed in grammar rules in which
the  aggregation  is  specified  in  the  embedding  relation.  The  cliches  are  then  recognized  in 
programs  by  using  the  embedding  relation  to  introduce  the  cliched  aggregation  organization 
into  the  parsing  process. 

In Section 2.3.2, two organizational variations of PiSim are pointed out (in Figures 2-10
and 2-12). In one, the initialization and storage-requirements computations are found within
Inject, while the other separates these computations out into the functions Initialize-Simulator
and Compute-Storage-Requirements. The first aggregates four pieces of data into
a  Message  data  structure  and  then  nests  this  inside  an  Event  data  structure,  along  with  a 
Time  part.  The  other  aggregates  three  pieces  of  data  into  a  Handler-Data  data  structure 
and  then  nests  it  inside  a  Msg  data  structure,  along  with  a  Destination  and  Arrival-Time 
part.  Both  aggregate  the  same  pieces  of  data,  but  using  different  nesting  organizations, 
ordering  of  parts,  and  names  for  structures  and  parts. 

However,  these  two  programs  have  the  same  basic  flow  graph  representation,  which  is 
shown  in  Figure  5-1.  The  only  difference  between  the  two  is  in  their  edge  attributes,  as 
shown in Figure 5-2. (One program, Inject, iteratively calls a function Execute-Next-Event,
while the other, Start-Pisim, calls Process-Next-Message. The flow graph representations
of these two calls are the same. This flow graph is hidden in the dotted node labeled
“Execute-Next-Event.” Likewise, the dotted node labeled “Enqueue-Event” represents calls
to the functions Enqueue-Event (by Inject) and Enqueue-Message (by Start-Pisim), which
each have the same flow graph representation. Also, the recursive node shown in Figure
5-1 is labeled “Execute-Events,” but in the flow graph for Start-Pisim, the recursive node
is  labeled  “Process-Messages.”  This  difference  is  not  significant,  since  the  recursive  nodes 
are  never  expected  to  match  any  right-hand  side  node  during  parsing.) 
5.1.3 Delocalized Cliches

Using  the  flow  graph  representation  also  addresses  the  problem  that  parts  of  a  cliche  may 
be  scattered  throughout  the  text  of  a  program.  Many  cliches  become  much  more  localized 
in  the  flow  graph  than  in  the  program  text  because  only  essential  dataflow  relationships  are 
captured.  For  example,  in  Figure  2-13,  a  portion  of  the  CST  code  is  shown.  Even  though 
parts  of  a  simulation  cliche  are  separated  by  unrelated  expressions  in  the  source  text,  they 
are  translated  into  neighboring  nodes  in  the  flow  graph  representation  of  the  program.  This 
representation  is  shown  in  Figure  5-3.  The  nodes  that  are  unrelated  to  the  simulation  cliche 
are  shaded. 

5.1.4  Unrecognizable  Code 

GRASPR is able to recognize cliches despite the presence of unrecognizable code in the program.
This is partly due to GRASPR’s cliche localization abilities, which help to separate the
familiar from the unfamiliar parts of the program. The cliched sections of a program tend
to  become  localized  in  sub-flow  graphs  of  the  program’s  flow  graph  representation. 

The  other  aspect  of  GRASPR’s  approach  that  makes  partial  recognition  possible  is  the 
bottom-up  parsing  strategy  it  uses.  It  recognizes  and  reports  low-level  cliches,  even  if  it 
cannot reconstruct the higher level design that puts them together. All non-terminals are
treated  as  start-types  of  the  grammar,  so  that  each  instance  of  any  non-terminal  is  reported. 

GRASPR  has  been  specifically  designed  to  solve  the  partial  program  recognition  problem, 
which  is  defined  in  Section  3.3.1:  Given  a  program  and  a  library  of  cliches,  find  all  instances 
of  the  cliches  in  the  program  (i.e.,  determine  which  cliches  are  in  the  program  and  their 
locations).  It  formulates  this  problem  in  terms  of  the  subgraph  parsing  problem,  which  is: 
Given a flow graph F and a flow graph grammar G, find all possible parses of all sub-flow
graphs  of  F  that  are  in  the  language  of  G. 

In other words, when a program is partially recognized, one or more sub-flow graphs
of  the  program’s  flow  graph  encoding  are  recognized  as  members  of  the  graph  grammar 
which  encodes  the  cliche  library.  It  follows  from  the  definition  of  a  sub-flow  graph,  that  it  is 
possible  to  ignore  portions  of  a  flow  graph  before  and  after  a  recognizable  sub-flow  graph, 
as  well  as  portions  that  fan  out  from  or  into  an  internal  port  in  the  sub-flow  graph. 

What  this  means  in  terms  of  partially  recognizing  programs  is  that  GRASPR  can  recognize 
a  cliche  in  the  presence  of  unrecognizable  code  or  code  that  belongs  to  other  cliches,  as  long 
as  the  cliche  is  localized  into  a  sub-flow  graph  of  the  program’s  flow  graph  representation. 
It  must  be  possible  to  separate  the  cliche  from  the  rest  of  the  flow  graph  by  disconnecting 
a  set  of  edges. 
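
The idea that a cliche can be found inside a larger graph by ignoring the edges that connect it to its surroundings can be illustrated with a deliberately naive Python sketch. This is brute-force labeled subgraph matching, not GRASPR's chart parser, and it ignores ports, edge attributes, and control flow constraints; the function name `find_matches` and the operation names are assumptions made for the example.

```python
from itertools import permutations


def find_matches(pattern_nodes, pattern_edges, graph_nodes, graph_edges):
    """Find every placement of a small pattern inside a larger flow graph.

    pattern_nodes, graph_nodes: node id -> operation type
    pattern_edges, graph_edges: sets of (source id, destination id)
    Graph nodes and edges outside the pattern are simply ignored, which
    corresponds to disconnecting the edges around the matched sub-graph.
    """
    pattern_ids = sorted(pattern_nodes)
    matches = []
    for candidate in permutations(sorted(graph_nodes), len(pattern_ids)):
        binding = dict(zip(pattern_ids, candidate))
        # Operation types must agree node by node.
        if any(pattern_nodes[p] != graph_nodes[g] for p, g in binding.items()):
            continue
        # Every dataflow edge of the pattern must exist in the program graph.
        if all((binding[s], binding[d]) in graph_edges for s, d in pattern_edges):
            matches.append(binding)
    return matches


# A toy "average" pattern: a sum and a count both feed a division.
average_nodes = {"s": "sum", "c": "count", "d": "divide"}
average_edges = {("s", "d"), ("c", "d")}

# A program graph with unfamiliar operations before, after, and beside it.
program_nodes = {"g1": "enumerate", "g2": "sum", "g3": "count",
                 "g4": "divide", "g5": "print", "g6": "log"}
program_edges = {("g1", "g2"), ("g1", "g3"), ("g2", "g4"),
                 ("g3", "g4"), ("g4", "g5"), ("g2", "g6")}

matches = find_matches(average_nodes, average_edges, program_nodes, program_edges)
```

In this toy graph the pattern is found once, bound to g2, g3, and g4, even though g1 feeds it, g5 consumes its result, and g6 fans out from one of its internal nodes. The exhaustive search is exponential in the pattern size and is meant only to make the definition concrete.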

GRASPR is able to ignore unfamiliar code that “surrounds” a cliche (in that it sends
dataflow to it and/or receives dataflow from it). See Figure 5-4b. It is also able to ignore
unfamiliar code that is done conditionally (assuming that the control flow constraints do
not require co-occurrence relations to hold between the component operations). See Figure
5-4c.

Figure 5-3: Flow graph representing the CST code of Figure 2-13.

Figure 5-4: a) Average cliche, b-c) Some cases in which a program can be partially recognized.

GRASPR can partially recognize a program that not only has unfamiliar algorithmic fragments,
but also has data structures that aggregate unfamiliar parts. It is able to ignore
computation  on  unfamiliar  parts  of  an  aggregate  data  structure.  This  is  a  direct  result  of 
the  parser’s  techniques  for  recognizing  aggregation-equivalent  flow  graphs,  as  described  in 
Sections  3.4.2  and  3.5.2.  These  techniques  allow  recognition  of  a  cliched  data  structure  in 
a  user-defined  data  structure  even  when  the  cliche  aggregates  only  a  subset  of  the  parts 
aggregated  by  the  user-defined  structure. 

For example, suppose the cliche library contained a cliche called Extract-Message, which
is the common computation of looking up a SYNCH-NODE in an ADDRESS-MAP, given an integer
index, dequeuing its Buffer part and updating the ADDRESS-MAP so that the integer index
points to the new SYNCH-NODE. The rules encoding Extract-Message and the Local-Buffer-Dequeue
cliche it contains as a part are shown in Figure 5-5.

This cliche is found in the program shown in Figure 5-6 which operates on a user-defined
node data structure. The node consists of five parts, one of which (Queue) corresponds to
the Buffer part of a SYNCH-NODE. The value of *nodes* corresponds to the ADDRESS-MAP. In
addition to performing the Extract-Message operation, this program increments the Busy-Count
part of the new node created. It also calls process-message on the msg dequeued, the
ADDRESS-MAP, and *step-queue* (which is the global MESSAGE buffer).

GRASPR partially recognizes the node data structure as well as the program step. The flow
graph representation of step is shown in Figure 5-7. (The dotted node labeled “Dequeue”
is an abbreviation for a flow graph that is derived by the FIFO-Dequeue non-terminal.)
The destructuring and construction of the user-defined node data structure is represented
Figure 5-5: Rules for Extract-Message and Local-Buffer-Dequeue cliche.


(defun step (node-nr)
  (let* ((node (get-node node-nr))
         (q (node-queue node)))
    (multiple-value-bind (msg new-queue)
        (dequeue q)
      (setq node
            (make-node :queue new-queue
                       :objects (node-objects node)
                       :contexts (node-contexts node)
                       :busy-count (1+ (node-busy-count node))
                       :method-cache (node-method-cache node)))
      (setq *nodes* (copy-replace-elt node node-nr *nodes*))
      (multiple-value-bind (new-nodes new-step-queue)
          (process-message msg *nodes* *step-queue*)
        (setq *nodes* new-nodes *step-queue* new-step-queue)))))


Figure  5-6:  Code  containing  a  partially  recognized  data  structure. 



Figure  5-7:  Flow  graph  representation  for  step. 
in attributed fan-out and fan-in edges. This facilitates the separation of the unfamiliar
computation  (the  increment  of  the  node’s  Busy-Count)  from  the  familiar.  It  allows  GRASPR 
to  recognize  Extract-Message  by  parsing  the  sub-flow  graph  that  results  from  disconnecting 
the  shaded  portion  of  step’s  flow  graph  from  the  rest  of  the  flow  graph. 

5.1.5  Function-Sharing 

The  derivations  generated  for  programs  by  the  flow  graph  parser  do  not  have  to  be  strictly 
hierarchical.  This  means  that  GRASPR  is  able  to  recover  the  design  of  a  program,  even  when 
parts  of  the  implementation  of  two  distinct  abstract  operations  overlap  as  a  result  of  an 
optimization.  In  effect,  GRASPR  “undoes”  the  optimization. 

For  example,  in  Section  2.3.2,  Figures  2-19  and  2-21  show  two  programs  that  differ  only 
in  that  one  optimizes  the  other  by  enumerating  the  array  nodes  once  instead  of  twice.  The 
enumeration  is  shared  between  the  two  cliched  operations  of  advancing  each  node  in  nodes 
and  computing  the  average  length  of  their  Queue  parts. 

GRASPR  is  able  to  recognize  these  two  cliches  in  both  programs,  even  though  they  overlap 
in  one.  GRASPR  does  not  destructively  reduce  the  input  flow  graph  representing  the  program. 
It  allows  the  recognition  of  a  part  of  the  flow  graph  to  be  seen  as  part  of  more  than  one 
higher-level  cliche.  The  resulting  design  trees  share  a  sub-tree,  as  is  shown  in  Figure  2-22. 

5.1.6  Redundancy 

GRASPR is able to deal with variation due to redundancy, which occurs when some part of
a  cliche  appears  more  than  once  in  the  same  instance  of  a  cliche.  There  are  two  types  of 
redundancy  that  we  encountered  in  dealing  with  real  programs. 

One type is the repetition of some computation on the same set of inputs and/or producing
outputs that are conditionally merged into the same consumer operation. An example
of this is discussed in Section 2.3.2 and shown in Figure 2-23. In this example, the computation
of accessing the first element of Bucket-List using car is performed twice. The parser’s
ability  to  recognize  share-equivalent  programs  allows  GRASPR  to  tolerate  the  variation  due 
to  this  type  of  redundancy.  In  particular,  the  parser  zips  up  the  flow  graph  representation 
of  the  program,  allowing  it  to  recognize  the  cliche  Ordered-Associative-List.  That  is,  it 
generates  an  alternative  view  of  the  program  in  which  the  redundancy  is  removed. 
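
A minimal Python sketch of zipping up is given below, assuming side-effect-free operations and covering only the first case mentioned above (repeated computation on the same inputs, not conditionally merged outputs). It is not the parser's actual mechanism, which produces the alternative view during parsing; the function name `zip_up`, the graph encoding, and the operation names `entry-key` and `entry-value` are hypothetical.

```python
def zip_up(graph):
    """Merge nodes that apply the same operation to the same inputs.

    graph: node id -> (operation, tuple of input ids); ids that are not
    keys of `graph` are treated as external inputs.
    Returns (zipped graph, node id -> id of its representative).
    """
    rep = {node: node for node in graph}

    def find(node):
        # Follow merge links to the surviving representative.
        while rep.get(node, node) != node:
            node = rep[node]
        return node

    changed = True
    while changed:  # repeat, since one merge can make consumers identical
        changed = False
        seen = {}
        for node in sorted(graph):
            if find(node) != node:
                continue  # already merged into another node
            operation, inputs = graph[node]
            key = (operation, tuple(find(i) for i in inputs))
            if key in seen:
                rep[node] = seen[key]
                changed = True
            else:
                seen[key] = node

    zipped = {
        node: (operation, tuple(find(i) for i in inputs))
        for node, (operation, inputs) in graph.items()
        if find(node) == node
    }
    return zipped, {node: find(node) for node in graph}


# The first element of the bucket list is accessed twice with car.
program = {
    "car-1": ("car", ("bucket-list",)),
    "car-2": ("car", ("bucket-list",)),
    "key": ("entry-key", ("car-1",)),
    "value": ("entry-value", ("car-2",)),
}
zipped, merged_into = zip_up(program)
```

After zipping, only one `car` node remains and both consumers read from it, which is the redundancy-free view a cliche rule with a single access would match.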

The  second  type  of  redundancy  occurs  when  a  loop  is  unrolled  or,  more  generally,  a 
recursion  is  unfolded.  This  arises  in  our  example  programs  when  we  convert  the  original 
programs, which contain destructive operations (causing side effects to mutable data structures),
to their non-destructive versions. As described in Section 4.2.2, this is handled by
an  additional  chart  monitor  that  creates  an  alternative  view  in  which  the  recursion  is  folded 
back  up. 
5.1.7 Implementation Variation


GRASPR  is  able  to  recognize  two  programs  that  perform  the  same  cliched  abstract  operation, 
even  though  they  may  use  two  different  implementations  of  that  operation.  This  is  because 
the cliche library is encoded in a grammar that explicitly captures implementation relationships
between the cliches. So GRASPR is able to view and describe structures on various
levels  of  abstraction. 

This  enables  it  to  produce  the  same  high-level  description  of  the  two  versions  of  the  CST 
program  shown  in  Figures  2-16  and  2-17  of  Section  2.3.2,  even  though  they  differ  on  a  lower 
level  of  abstraction  in  their  implementation  of  the  global  message  queue.  GRASPR  produces 
the  design-trees  shown  in  Figures  2-14  and  2-18  for  the  two  versions.  They  differ  only  in 
the  subtrees  that  are  highlighted  by  dotted  boxes  in  Figure  2-18. 

It  is  impractical  to  enumerate  all  possible  implementational  variations  of  an  abstract 
cliche  in  the  cliche  library  as  flat  structures.  However,  the  hierarchical  organization  of  the 
cliche  library  allows  implementation  variation  to  be  represented  compactly. 

5.2  Limitations 

Our  recognition  approach  is  based  primarily  on  dataflow  graph  matching  and  control  flow 
constraint  checking.  The  success  of  this  approach  depends  on  being  able  to: 

1.  faithfully  capture  the  program’s  dataflow  in  our  flow  graph  representation  and  the 
program’s  control  flow  in  the  attributes,  and 

2.  express  a  programming  cliche  in  an  attributed  graph  grammar  rule  in  terms  of  its  data 
and  control  flow  constraints  (i.e.,  operation  types  and  arity,  dataflow  connections, 
control  environment  relationships). 

In general, the limitations of our approach arise when one or both of these cannot be
achieved. The first criterion cannot be met when the dataflow or control flow of
the  program  cannot  be  completely  captured  by  static  analysis  or  the  dataflow  is  not  made 
explicit  (in  that  it  is  derived  from  intermediate  computations).  The  second  criterion  is  not 
satisfied  for  cliches  that  have  loosely  constrained  data  and  control  flow  or  that  are  defined 
by  characteristics  other  than  data  and  control  flow. 

This section gives specific situations in which we encountered these limitations in experimenting
with the recognition of our example programs. It also suggests ways of dealing
with these problems, e.g., by collaborating with other mechanisms or eliciting and accepting
advice from a person. (There are additional limitations to the current recognition system
that  represent  open  research  problems,  rather  than  inherent  difficulties  with  the  approach. 
These  are  discussed  in  Section  7.2.) 


5.2.1  Missing  or  Derived  Dataflow 

Our cliches are basically expressed as dataflow graphs. A cliche can be recognized only if a
sub-flow  graph  of  the  flow  graph  representing  the  program  is  isomorphic  to  the  cliche’s  flow 
graph  representation.  Unfortunately,  sometimes  a  cliche  exists  in  a  program,  but  GRASPR 
fails  to  find  it  because  dataflow  links  are  derived  or  missing. 

The  principal  cause  of  missing  dataflow  (and  control  flow)  information  in  our  example 
simulator programs is that they accept functions for simulating individual machine operations
as input. This prevents data and control flow from being completely determined
statically. 

We  found  three  common  causes  of  derived  dataflow  links  in  our  example  programs.  One 
is  that  a  primary  part  of  a  cliched  data  structure  may  correspond  to  a  part  of  a  data 
structure  in  the  program  that  is  a  handle.  The  handle  is  used  to  look  up  the  piece  of  data 
that  actually  corresponds  to  the  cliche’s  primary  part.  For  example,  our  Execution-Context 
data cliche contains a sequence of INSTRUCTIONs as a primary part. In the CST program, on
the other hand, the corresponding data structure, called Context, has a “Code” part that
is a symbol. This symbol is used to look up a Block, which is a sequence of INSTRUCTIONs,
in  a  pooling  structure  containing  all  existing  Blocks. 

The problem with non-cliched uses of handles is that they introduce intermediate
computation which interrupts data flowing from one primitive operation to another. This
computation looks up a piece of data using a handle into a pooling structure.

Unsimplified code is a second cause of obscured dataflow links. For example, in
(F (Abs-val (G x))), where (G x) is always positive, there is effectively direct dataflow from
G to F, but the redundant application of Abs-val hides it.
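As a hedged illustration of how a simplification pass could restore such a link, the following Python sketch rewrites expressions encoded as nested tuples. The tuple encoding, the function name `simplify`, and the advice set `NONNEGATIVE` are assumptions made for this example; they are not part of GRASPR.

```python
# Expressions are nested tuples: (operator, argument, ...). Leaves are strings.
# NONNEGATIVE is assumed advice: operators whose results are never negative.
NONNEGATIVE = {"G"}


def simplify(expr):
    """Drop Abs-val applications whose argument is known to be non-negative."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [simplify(a) for a in args]
    if op == "Abs-val" and len(args) == 1:
        inner = args[0]
        if isinstance(inner, tuple) and inner[0] in NONNEGATIVE:
            return inner  # Abs-val acts as the identity here
    return (op, *args)


original = ("F", ("Abs-val", ("G", "x")))
simplified = simplify(original)  # G now feeds F directly
```

When nothing is known about the argument, the expression is left unchanged, so the rewrite only exposes links that the advice justifies.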

A third cause is that a program may implicitly aggregate heterogeneous pieces of data,
rather than explicitly aggregating the data into a structure with named parts, using a
structuring primitive (such as DEFSTRUCT in Common Lisp). In implicit aggregation, a primitive
data structure, such as a list (in Common Lisp) or an array, is used to aggregate
heterogeneous pieces of data, where the position in the data structure matters. For example, PiSim
creates and uses an array whose first two elements cache information about a MESSAGE (Type
and Storage-Requirements), while the rest of the array holds the MESSAGE’s Arguments. This
array should be treated as an aggregate data structure with three parts: Type (a symbol),
Storage-Requirements (an integer), and Arguments (an array).
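The contrast between the two styles can be sketched in Python. This is only an analogy to the Lisp programs discussed here: the sample values, the field names, and the helper `make_explicit` are invented for illustration, and a Python dataclass stands in for a DEFSTRUCT definition.

```python
from dataclasses import dataclass

# Implicit aggregation: positions 0 and 1 cache the Type and the
# Storage-Requirements; every later position holds an argument.
implicit_message = ["request", 3, "a", "b", "c"]


@dataclass
class TaskSegment:
    """Explicit aggregation: the three parts are named."""
    type: str
    storage_requirements: int
    arguments: list


def make_explicit(raw):
    """Translate the positional layout into named parts."""
    return TaskSegment(
        type=raw[0],
        storage_requirements=raw[1],
        arguments=list(raw[2:]),
    )


explicit_message = make_explicit(implicit_message)
```

In the explicit form, an analyzer can see from the definition alone which part each access touches; in the implicit form, that knowledge lives only in the index conventions.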

Implicitly aggregated data structures are accessed and constructed with primitive
operations (such as aref) on the data structures at fixed indices. These operations are not
converted to attributed edges, as are selectors and constructors for explicit aggregations.

There are two problems with this. One is that with explicit aggregation, the dataflow
from one operation to another is represented as a direct edge annotated with accessor
and constructor attributes, but with implicit aggregation, this dataflow is interrupted by
primitive operations that access or update at a fixed index. In other words, the explicit
dataflow link is replaced by a “derived” dataflow link.

The other problem is that it loses the benefit of our representation for explicit
aggregation, which facilitates the separation of familiar and unfamiliar computations on parts of
a data structure. This separation allows partial recognition of the data structure and the
computation on it. (This capability is discussed in Section 5.1.4.)

The  underlying  difficulty  is  that  implicit  aggregation  hides  the  information  that  a  certain 
primitive  access  or  update  at  a  fixed  location  is  actually  a  selector  or  constructor  involving 
a  certain  data  structure  and  its  parts.  When  data  is  explicitly  aggregated  (e.g.,  using 
DEFSTRUCT),  the  structuring  primitive  serves  as  a  machine-readable  comment  that  specifies 
that  some  pieces  of  data  are  aggregated  and  are  only  accessed  and  constructed  using  certain 
functions.  It  also  provides  information  about  which  user-defined  data  structure  and  parts 
are  involved  in  the  selection  or  construction.  Additionally,  it  represents  the  intent  of  the 
programmer  to  only  use  these  accessors  and  constructors  to  manipulate  the  aggregation  and 
never  deal  with  it  directly  using  primitive  operations. 

(Note that people find it hard to deal with implicit aggregation as well. It requires
knowing how fixed locations in the data structure translate to the particular pieces of data
being aggregated. It requires effort to perform this mapping during recognition.)

Solution  Suggestions 

To  deal  with  the  variation  due  to  missing  or  derived  dataflow,  GRASPR  would  profit  from 
advice  from  a  user  or  collaboration  with  other  automated  techniques.  For  example,  classical 
rewriting  or  partial  evaluation  techniques  can  be  applied  to  simplify  parts  of  the  program. 
(See  Letovsky  [84]  and  Murray  [95],  for  example.)  By  interleaving  recognition  with  these 
other  techniques,  alternative  views  of  the  program  can  be  generated  to  facilitate  recognition. 
Recognition  in  turn  can  provide  a  more  abstract  view  of  the  program  and  generate  assertions 
about  parts  of  it,  based  on  the  known  properties  associated  with  the  cliches  that  have  been 
recognized  so  far. 

One way for GRASPR to elicit advice is by looking for “question-triggering” patterns
(in addition to cliches) which point to the possibility that some dataflow is derived. For
example, by looking for standard look up and update operations (such as associative-set
cliches), GRASPR might uncover a use of a handle. Recognizing that each node created during
initialization is put into *NODES* triggers asking the user if *NODES* always contains all the
NODES ever created. A fixed-position array or list access suggests an implicit aggregation
is being used. These hypotheses can then be presented to the user or some
expectation-driven component for confirmation. Once the use of a handle or an implicit aggregation is
uncovered, GRASPR can generate an alternative view of the flow graph in which the derived
links are made explicit attributed edges.
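One of these question-triggering patterns, the fixed-position access, is simple enough to sketch. The example below is an assumption-laden stand-in: it scans Python source with the standard `ast` module (Python 3.9 or later) rather than Lisp flow graphs, and the function name `fixed_index_accesses` is hypothetical.

```python
import ast


def fixed_index_accesses(source):
    """Return sorted (container, index) pairs for subscripts whose index
    is a literal integer, a hint that positions carry meaning."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and isinstance(node.slice, ast.Constant)
            and isinstance(node.slice.value, int)
        ):
            hits.append((node.value.id, node.slice.value))
    return sorted(hits)


sample = "kind = msg[0]\nstorage = msg[1]\nitem = queue[i]\n"
hypotheses = fixed_index_accesses(sample)
```

Each reported pair is only a hypothesis that the container is an implicit aggregate; as the text notes, it still has to be confirmed by the user or by some expectation-driven component.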

It can be more difficult for GRASPR to confirm its hypotheses on its own than for a
human user to confirm them, since the user can take advantage of expectations generated
from the mnemonic names and documentation. For example, it can be easy for a person
to tell whether a particular data structure is a pooling structure, just by its name: *nodes*
contains all Node data structures in PiSim, *blocks* contains all Block structures in CST.
(Alternatively, the user can give GRASPR advice about which structures are pooling structures
up front, without waiting for GRASPR to ask for it.)

A  special  (and  common)  case  of  implicit  aggregation  for  which  it  is  easy  for  a  person 
to  give  advice  is  manual  abstraction.  In  this  case,  functions  are  explicitly  defined  which 
perform  the  accesses  and  constructions  involving  fixed  indices  in  an  implicitly  aggregated 
data structure. In other words, the programmer manually defines the accessor and
constructor functions for an implicitly aggregated data structure. (These functions are defined
automatically  by  explicit  aggregation  primitives  (such  as  DEFSTRUCT).) 

This is distinguished from general implicit aggregation in that the aggregation is
explicit to people, even though it “looks” the same as implicit aggregation to GRASPR. The
aggregation is expressed in the naming conventions the manual abstraction functions use.
They also express the programmer’s intent not to violate the abstraction by manipulating
the aggregate directly using primitive operations. Since GRASPR does not take naming
conventions into account, these functions are flattened just like any other function. However,
a  person  can  easily  give  GRASPR  the  information  that  certain  functions  should  be  seen  as 
accessors  and  constructors  for  an  aggregate  data  structure. 

5.2.2  “Missing”  Cliche  Parts 

Another common reason for an algorithmic cliche not to be recognized is that part of
the cliche is replaced in the program by a special-case optimization. This optimization is
not  a  cliched  one;  it  happens  to  be  possible  in  the  context  in  which  the  cliche  is  used. 

A common instance of this occurs when some computation is avoided by using a value
that equals the result of that computation. This can be an opportune equality or an
intentionally cached value. For example, the cliche for polling the simulated nodes and
stepping those that have work to do contains an enumeration of the collection of simulated
nodes. The cliche for enumeration when the collection is implemented as a sequence has
a part that computes the size of the sequence and then uses it to determine how many
elements to enumerate. The instance of this cliche in the CST code does not compute the
size of *NODES*, but instead uses *NUMBER-NODES* which is a global variable specifying the
size of *NODES*. This variable is used during initialization to create *NODES*.

Sometimes part of a cliche is missing in the program because the general case represented
by the cliche has been simplified in the context of the program. For example, a part of the
Event-Driven Simulation cliche is a Priority-Queue Insert which adds an initial EVENT to the
Event-Queue. Because the Event-Queue is empty at this point, the general case of this cliched
operation can be reduced to the computation done when the priority queue is empty. (For
example, if the priority queue is implemented as an ordered associative list, the insertion
would simply cons the event onto the empty priority queue, without testing whether it is
empty or providing actions for splicing it in if it is not empty.) If the special-case version
of the cliche is a common optimization, then it is included in the library along with the
general case. However, when it is not, recognition of the cliche fails. (We cannot expect all
possible optimizations in the context of use to be cliched and we do not want to enumerate
them all in the library.)
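The priority-queue example can be made concrete with a small Python sketch. It is an illustration under assumptions, not code from either simulator: events are hypothetical `(time, name)` pairs, the queue is a list kept sorted by time, and both function names are invented.

```python
def pq_insert(queue, event):
    """General case: insert a (time, name) event into a time-ordered list."""
    if not queue:                          # the empty-queue test
        return [event]
    head, *rest = queue
    if event[0] <= head[0]:
        return [event, *queue]             # event belongs at the front
    return [head, *pq_insert(rest, event)]  # splice it in further down


def pq_insert_into_empty(event):
    """Special case: valid only where the queue is known to be empty."""
    return [event]


first_event = (0, "start")
general = pq_insert([], first_event)
special = pq_insert_into_empty(first_event)
```

The two functions agree on an empty queue, yet the special case contains neither the emptiness test nor the splicing logic, so its flow graph has no sub-graph matching the general insertion cliche.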

Solution  Suggestions 

What  is  needed  for  recognition  to  succeed  in  these  cases  is  for  the  special-case  computation 
and  the  general-case  cliche  to  be  seen  as  equivalent.  In  general,  this  cannot  be  done. 
However,  it  may  be  possible  to  apply  limited  reasoning  techniques  to  uncover  dataflow 
equalities or conditional simplifications in simple cases such as those discussed above.

Non-cliched special-purpose optimizations often cause some, but not all, of a cliche to be
recognized. One way to elicit advice on whether some computation is a special-case
optimization is to find maximally-sized near-misses (partial recognitions) of the cliche and then
generate a hypothesis that the cached value used is equal to the result of the computation
in the part of the cliche not yet matched.

Recognizing maximally-sized near-misses is costly (as is discussed in Section 6.2.7).
However,  we  can  generate  them  only  for  particular  cliches  and  at  particular  locations  in  the 
program  in  order  to  reduce  the  cost.  For  example,  we  can  choose  only  promising  cliches, 
such  as  those  for  which  some  salient  part  has  been  recognized,  and  we  can  look  for  them 
only  in  the  areas  of  the  program  that  have  not  already  been  recognized  as  part  of  other 
unrelated  cliches. 

5.2.3  Expressing  Cliches  with  Loose  Constraints 

In  encoding  cliches  as  constrained  dataflow  graphs  in  graph  grammar  rules  we  are  required  to 
specify  exactly  which  operations  (or  classes  of  operations)  make  up  a  cliche,  how  dataflow 
connects  them  to  each  other,  and  their  arity.  For  some  cliches  that  we  identified  in  our 
simulator  domain,  this  is  difficult  to  do. 

There are three different cases in which we encounter difficulties. One is in expressing
cliches that have as an integral part the application of an arbitrary, non-cliched and
non-primitive function. A second case is in compactly representing possible variations in the
implementation of an algorithmic cliche whose parts may be combined in several possible
valid configurations. The third case is in capturing a cliched data and control flow pattern
in which the operations and tests are not tightly constrained to be of particular types. The
dataflow between them is only loosely constrained as well.

Arbitrary Function Application

We  encountered  two  examples  of  types  of  cliches  that  are  difficult  to  encode  because  a  part 
of  them  is  the  application  of  an  arbitrary  function.  They  are  second-order  patterns,  in  that 
they  are  parameterized  over  arbitrary  functions,  which  are  non-cliched  and  non-primitive. 

One  example  arises  in  encoding  iteration  cliches,  as  discussed  in  Section  4.1.3.  These 
cliches  all  contain  applications  of  arbitrary  functions  or  predicates  in  an  iteration.  However, 
we  cannot  encode  these  cliches  without  requiring  the  functions  or  predicates  to  be  primitive 
operations  (terminals)  or  cliched  functions  (non-terminals).  For  example,  it  is  not  possible 
to  recognize  the  generation  cliche  in  the  following  code. 

(defun f (l)

  (f (cdr (cdr l))))

This is because the generating function is an arbitrary composition of primitives (i.e., the
generating function is (lambda (x) (cdr (cdr x)))).

Another example of this problem arises in trying to capture the simulation cliches
without requiring that the code for simulating message handling be cliched. In particular, we
wanted to express the cliche for processing an event (in event-driven simulation) or
advancing a node (in synchronous simulation) as having a part that applies some non-cliched
message handling simulation function.

Solution  Suggestions 

What  is  needed  is  a  special-purpose  mechanism  (separate  from  the  graph  parser)  to  bundle 
up  the  sub-flow  graph  that  satisfies  certain  constraints.  This  mechanism  can  make  use  of 
information  about  how  much  of  the  cliche  has  already  been  matched  to  focus  on  certain 
locations.  It  can  also  make  use  of  information  available  in  the  cliche’s  constraints. 

For  example,  in  the  iteration  cliches,  the  input  and  output  correspondence  constraints 
place  restrictions  on  which  sub-flow  graph  can  be  bundled  up.  Waters  [138]  has  developed 
general-purpose dataflow-based techniques for decomposing a program into temporally
abstract fragments. It would be useful to incorporate these decomposition techniques into
the  recognition  process  to  help  bundle  up  possible  functions.  For  instance,  bundling  up  the 
composition  of  cdrs  in  our  example  above  can  be  done  by  grouping  together  the  sub-flow 
graph  that  is  bounded  by  input  and  output  ports  that  input-correspond. 
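A much simplified picture of this bundling step is sketched below in Python. It is not Waters's decomposition technique: it merely collects the nodes that lie on some dataflow path between two chosen ports. The graph encoding, the node names, and the functions `reachable` and `bundle` are all assumptions made for the example.

```python
def reachable(edges, start):
    """All nodes reachable from start by following edges forward."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def bundle(edges, source, sink):
    """Nodes strictly between source and sink on some dataflow path."""
    reverse = {}
    for node, successors in edges.items():
        for nxt in successors:
            reverse.setdefault(nxt, []).append(node)
    between = reachable(edges, source) & reachable(reverse, sink)
    return between - {source, sink}


# Hypothetical flow graph for a function that recurs on (cdr (cdr l)).
flow = {
    "input-l": ["cdr-1", "null-test"],
    "cdr-1": ["cdr-2"],
    "cdr-2": ["recursive-call"],
    "null-test": ["exit"],
}
bundled = bundle(flow, "input-l", "recursive-call")
```

The two cdr nodes are grouped into one candidate generating function, while the termination test, which does not feed the recursive call, is left out.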

In  the  case  of  bundling  up  message  handling  simulation  code  when  no  cliched  function 
for  it  is  recognized  (as  in  CST),  it  might  be  possible  to  ask  for  advice  on  which  part  of  the 
program  achieves  this  purpose.  Also,  based  on  the  location  of  the  rest  of  the  cliche  and 
which  nearby  parts  of  the  program  are  unrecognizable,  GRASPR  might  be  able  to  hypothesize 
approximately which part of the program should be bundled up.

Implementational Variations

As  we  mentioned  in  Section  2.1.3,  there  are  many  variations  of  our  synchronous  simulation 
algorithm.  On  each  iteration,  the  algorithm  we  described  performs  three  actions  in  the 
following  order:  test  for  termination,  deliver  messages,  and  poll  and  advance  nodes  by  one 
step.  The  other  variations  of  this  algorithm  in  which  a  different  ordering  is  used  also  perform 
synchronous  simulation. 

However,  each  of  these  variations  is  represented  by  a  different  dataflow  graph.  For 
example,  the  algorithm  described  in  Section  2.1.3  has  the  form  shown  in  Figure  5-8a.  (This 
is  a  sentential  form  of  our  current  grammar  which  encodes  the  algorithm.)  Two  other  valid 
configurations are shown in Figures 5-8b and 5-8c. In fact, all six permutations of the three
actions  are  valid  configurations. 

The  problem  is  that  we  must  deal  with  these  variations  by  enumerating  them  in  the 
cliche  library.  This  is  because  the  flow  graph  encoding  forces  us  to  specify  the  exact  dataflow 
connections  between  the  three  operations  and  therefore  a  particular  ordering. 
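The size of this enumeration is easy to see in a short Python sketch. The action labels are paraphrased from the description above and the variable names are illustrative only.

```python
from itertools import permutations

ACTIONS = (
    "test-for-termination",
    "deliver-messages",
    "poll-and-advance-nodes",
)

# One library entry would be needed per ordering of the loop body.
orderings = list(permutations(ACTIONS))
```

With three actions the library needs six entries; since the count grows factorially with the number of independently orderable parts, enumeration becomes unattractive quickly.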

It  is  an  open  question  whether  there  is  a  more  compact  representation  for  algorithmic 
cliches that vary in this way. (For example, reasoning about a program’s functional
semantics, as is done by Allemang’s DUDU [4, 5], may help tolerate this variation.) In addition,
more  experience  with  encoding  cliches  is  needed  to  tell  how  severe  this  problem  is  and  how 
frequently  it  occurs  in  practice. 

General  Data  and  Control  Flow  Pattern 

Because  our  formalism  forces  us  to  specify  many  details  of  dataflow,  operation  types,  etc., 
it  is  sometimes  hard  to  express  some  common  data  and  control  flow  patterns  that  are  not 
tightly  constrained.  One  cliche  we  had  difficulty  expressing  is  a  common  type  of  conditional 
dispatch  which  occurs  in  program  interpreters  (particularly  for  the  Lisp-like  languages). 

This cliche is the “Evaluate” part of an EVALUATE/APPLY recursion for interpreting
statements in a language. The standard algorithm for this dispatches on the type of an expression
to code for handling that expression. For some expression types, there are standard
computations to perform. For example, for expressions that are constants, the expression is
simply returned. For expressions that are applications of some operator to a set of
arguments (which are themselves expressions), each argument is recursively evaluated and the
operation is applied to the set of evaluated arguments.
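A minimal instance of this dispatch pattern, written in Python for a hypothetical two-operator language, is sketched below. The operator table, the tuple encoding of applications, and the function names are assumptions for illustration and do not correspond to PiSim's Evaluate.

```python
import operator

# Hypothetical interpreted language with two binary operators.
OPERATORS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Dispatch on the type of expression."""
    if isinstance(expr, (int, float)):   # constant: simply returned
        return expr
    if isinstance(expr, tuple):          # application: (operator, arg, ...)
        op, *args = expr
        return apply_operator(op, [evaluate(a) for a in args])
    raise TypeError(f"unknown expression type: {expr!r}")


def apply_operator(op, values):
    """Apply the operator to the already evaluated arguments."""
    result = values[0]
    for value in values[1:]:
        result = OPERATORS[op](result, value)
    return result


value = evaluate(("+", 1, ("*", 2, 3)))
```

A different interpreted language would add or remove branches in `evaluate` and change what each branch does, which is exactly the kind of variation that a fixed dataflow graph has trouble capturing.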

However,  instances  of  this  cliche  vary  with  the  types  of  expressions  that  can  be  evaluated, 
which  depends  on  the  language  of  the  program  being  interpreted.  The  number  and  type  of 
test  cases  in  the  conditional  dispatch  vary.  The  actions  that  are  dispatched  to  also  vary. 
The  dataflow  connection  constraints  are  flexible.  The  problem  is  that  in  our  formalism,  we 
must  specify  the  number  and  types  of  tests  and  actions,  and  the  exact  dataflow  between 
them.  A  more  abstract  language  for  expressing  abstract  data  and  control  flow  patterns  is 
needed.

The point of this section and the previous is that although the flow graph formalism
allows us to encode cliches on a high level of abstraction, the level of abstraction is still
limited by the amount of detail that must be specified. Perhaps there are ways of
combining this formalism with even more abstract formalisms that will allow looser dataflow
constraints. For example, perhaps we can encode and recognize parts of cliches within the
dataflow graph formalism, and then use a different encoding to express constraints on how
these parts fit together.

5.2.4  Enqueuing  New  Messages  and  Events 

This  section  deals  with  a  problem  that  arises  both  as  a  result  of  not  being  able  to  fully 
determine  the  data  and  control  flow  of  the  example  programs  and  of  not  being  able  to 
express  and  efficiently  check  certain  constraints. 

As  mentioned  in  Section  4.1.4,  one  of  the  actions  of  a  processing  node  that  is  simulated 
as  part  of  the  simulation  of  message  handling  is  the  creation  and  sending  of  new  messages. 
One  of  the  constraints  on  both  simulation  algorithms  is  that  whenever  a  message  send  is 
simulated, a new EVENT or MESSAGE must be created and added to the event-queue or global
message  buffer,  respectively. 

We did not include this constraint in the grammar rules that encode the
synchronous and event-driven simulation cliches. There are three obstacles to expressing
and checking this constraint within our graph parsing framework.

One is that the computation involved (enqueuing new EVENTs or MESSAGEs) is buried
within the code for simulating a processing node’s action. This code is not guaranteed to
be  cliched,  so  we  do  not  have  grammar  rules  that  derive  all  possible  flow  graphs  representing 
this  code.  This  means  that  we  have  no  context  in  which  to  express  the  constraint. 

Even supposing it is cliched, we still have a second problem, which is that the part of the
simulation code that performs the activity of enqueuing new EVENTs (or MESSAGEs) is typically
given as input to the simulator. So, it is not available for analysis. The cliche models the
application of functions for simulating a processing node’s actions during an instruction
execution. Since these functions are not part of what is analyzed, the exact data and
control flow connecting the enqueuing operation to the rest of the cliche are not explicitly
represented.

Finally, suppose we had the code available. That is, rather than accepting functions
to simulate the actions of a processing node in executing some machine operation, suppose
the simulator program contains a large conditional which dispatches on machine operation
types to the code simulating operation execution. We encounter yet a third problem, which
is that in the current parsing framework, it is difficult to express and check the constraint
that each time a message send is simulated (i.e., a new EVENT or MESSAGE is created), the
new EVENT (or MESSAGE) is added to the event-queue (or global message buffer). It requires
expressing and checking constraints that are quantified over instances of some computation.


A special-purpose global mechanism is needed to check this constraint, since the parser
is currently only able to check constraints on individual instances. In addition, it requires
some means of finding all instances of creating whatever user-defined data structure
corresponds to our cliched aggregate EVENT (or MESSAGE). This requires unambiguous
information about the mapping from cliched data structures to user-defined ones. Also, since
aggregate data structure creation is encoded in edge attributes, finding the instances of
user-defined data structure creation cannot be done by recognizing a flow graph. Instead it
must focus on patterns in edge attributes.
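To make the quantified nature of the constraint concrete, here is a hedged Python sketch of a global check over edge attributes. The `Edge` record, its `constructs` field, and the function `unenqueued_creations` are hypothetical; the sketch also only considers direct flow from the creation to the enqueuing operation, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    sink: str
    constructs: str = ""  # aggregate built along this edge, if any


def unenqueued_creations(edges, structure, enqueue_op):
    """Edges that build `structure` but do not feed `enqueue_op`.

    The constraint holds only if this list is empty, i.e. it is
    quantified over every creation instance in the graph.
    """
    return [
        edge
        for edge in edges
        if edge.constructs == structure and edge.sink != enqueue_op
    ]


edges = [
    Edge("make-event-1", "enqueue", "EVENT"),
    Edge("make-event-2", "log", "EVENT"),
    Edge("read-clock", "enqueue"),
]
violations = unenqueued_creations(edges, "EVENT", "enqueue")
```

Unlike a grammar rule, which constrains one matched instance at a time, the check scans the whole graph and inspects attributes rather than node types.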

In summary, problems arise when:

•  an integral part of a cliche is non-cliched and the constraint we want to express refers
to this non-cliched part,

•  the  data  and  control  flow  relating  the  constrained  part  of  the  cliche  to  the  rest  of  the 
cliche  are  not  completely  and  statically  determined  (e.g.,  because  part  of  the  program 
is  read  in  as  input),  or 

•  the constraint quantifies over instances of some computation, particularly if the
computation is a data structure creation or access, not the application of some primitive
operations.

Solution  Suggestions 

Although  the  enqueuing  constraint  is  difficult  to  express  and  check  within  the  current  graph 
parsing  framework,  it  is  not  a  hard  constraint  for  a  person  to  check.  The  person  has 
the  advantages  of  understanding  mnemonic  names  which  give  clues  about  the  purposes  of 
machine  operations.  A  person  might  also  have  expectations  about  which  machine  operations 
cause  message  sends,  based  on  knowledge  of  the  machine  being  simulated. 

Rather  than  requiring  that  more  code  be  given  to  GRASPR  for  analysis  or  extending  the 
parser  to  quantify  constraints  over  instances,  it  might  be  easier  to  just  ask  the  user  whether 
the  constraint  holds.  The  constraint  should  be  expressed  more  generally  as  a  condition  on 
the  code  that  simulates  a  node’s  action.  If  we  are  already  eliciting  advice  on  which  part 
of the program handles a message (as suggested in Section 5.2.3), then we could also ask
whether  this  general  constraint  holds.  GRASPR  might  also  ask  for  the  simulator  function  that 
is  called  to  perform  the  enqueuing  and  then  can  analyze  that  code  to  understand  better 
how  the  event-queue  (or  global  message  buffer)  is  implemented. 

5.2.5  Modifications  to  Example  Programs 

To  enable  GRASPR  to  recognize  the  example  simulator  programs,  we  made  the  following 
changes  to  the  programs.  Some  avoid  the  inherent  limitations  of  the  graph  parsing  approach 
discussed  in  this  section.  Others  help  GRASPR  deal  with  difficulties  in  the  current  system, 
which we expect to be addressed by extensions to GRASPR in the future. (For example,
these include recognizing programs that are multiply-recursive or that perform side effects
to mutable objects. See Section 7.2.) Appendix B contains the original versions of the two
simulator  programs,  as  well  as  their  translations. 

•  We translated instances of implicit aggregation (including manual abstractions) to
explicit aggregations. For example, we defined a Task-Segment data structure in PiSim
to explicitly aggregate the Type, Storage-Requirements, and Arguments of a MESSAGE.
In CST, we replaced the manual abstraction for msg with a msg structure definition.

•  We simplified conditionals and canonicalized conditions involving NOT, OR, and AND.
(See step-done and enqueue in CST, for example.)

•  We manually undid special-case (noncliched) optimizations that take advantage of an
opportune dataflow equality or a cached value. That is, we restored the computational
part of a cliche that is avoided by an optimization. For example, in CST’s step-nodes
function, which enumerates and steps the simulated nodes, the use of *number-nodes*
is replaced by a call to array-total-size.

•  To deal with the problem of encoding and recognizing loosely constrained cliches, we
provided advice to GRASPR about where these cliches were located. (In a future hybrid
system, we expect this advice to come from other recognition techniques that can deal
with these types of cliches. See Section 7.2.2.) During the translation of the PiSim
program to a plan, we advised the symbolic evaluator that the box representing the
call to the function Evaluate not be expanded. This avoids a limitation in the current
implementation of GRASPR which prevents it from translating multiply-recursive
programs into meaningful attributed flow graphs. (See Section 7.2.1.) We also specified
that the unexpanded call to Evaluate is an instance of the “Evaluate” cliche. (See
Section 7.2.2.) Similarly, during the translation of the CST program, we specified that
the process-msg function not be expanded and that it represents an instance of the
Handle-Message non-terminal.

When the symbolic evaluator creates the plan representation of a program (which is
then translated to an attributed flow graph), it starts with some topmost function
and recursively expands calls to user-defined functions into their plan representations.
Only plans for functions whose calls are reached by the evaluator are included in the
plan representation. This means the flow graphs for some functions in the example
programs are not included as sub-flow graphs of the input graph parsed. In particular,
those that are only called by Evaluate in PiSim and process-msg (or its subfunctions)
in CST are not included. Also, functions in PiSim called by the Machine-Operation
functions given as input to PiSim cannot be expanded into the program's plan
representation. In addition, some logging and tracing functions in both programs are not
expanded.

•  We translated the programs into their functional versions by replacing destructive
operations  with  their  non-destructive  counterparts.  (See  Section  7.2.4  for  ideas  on 
partially  automating  this  translation.) 

•  All iterative computations are treated as tail-recursions by GRASPR. Currently, the
translation from iterative to tail-recursive procedures is done manually, but it is
well-known that this translation is straightforward to automate.

•  Program  breaks,  errors,  and  non-local  program  exits  are  currently  ignored  in  that 
they  are  treated  as  ordinary  calls  to  primitive  operations.  The  non-local  control  flow 
they  cause  is  not  modeled  in  our  control  flow  attributes.  Further  research  is  needed 
to  determine  how  best  to  model  non-local  flow.  See  [117],  Section  3.4,  for  further 
discussion  of  this  problem. 
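The iterative-to-tail-recursive translation mentioned in the list above can be sketched on a small, invented example. The two Python functions below are hypothetical and only show the shape of the translation: the loop variables become extra parameters of the recursive function. Python does not eliminate tail calls, so the second form is for illustration rather than for use on long sequences.

```python
def count_ready_iterative(nodes):
    """Loop form: count the nodes that have work to do."""
    count = 0
    for node in nodes:
        if node["ready"]:
            count += 1
    return count


def count_ready_tail(nodes, index=0, count=0):
    """Tail-recursive form: the loop state is passed as arguments."""
    if index == len(nodes):
        return count
    step = 1 if nodes[index]["ready"] else 0
    return count_ready_tail(nodes, index + 1, count + step)


sample_nodes = [{"ready": True}, {"ready": False}, {"ready": True}]
```

Both functions compute the same result; only the second exposes the iteration as a recursive call whose arguments carry the state from one step to the next.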

5.2.6  Conclusion 

We  have  made  observations  of  difficulties  encountered  in  recognizing  two  programs.  These 
might  be  relatively  rare  problems  or  they  might  be  common.  There  is  currently  no  natural 
partitioning  of  programs  based  on  the  difficult  features  they  contain  with  respect  to  recogni¬ 
tion.  This  report  starts  to  point  out  some  features  that  might  distinguish  programs  that  are 
hard  to  recognize  from  others  (at  least  within  the  realm  of  recognition  bzised  on  dataflow 
and  control  flow).  Much  more  research  is  needed  to  map  out  this  space  of  recognition 
difficulty. 
Chapter 6


Analysis 


Our  flow  graph  parsing  algorithm  is  worst-case  exponential  in  both  space  and  time.  For 
each  rule  of  the  grammar,  the  parser  is  searching  for  a  way  to  match  each  node  of  the 
rule’s  right-hand  side  to  an  instance  of  the  node’s  type  in  the  input  graph.  This  search  is 
inherently  exponential.  In  fact,  the  flow  graph  recognition  problem  for  flow  graphs  -  given 
a  flow  graph  F  and  a  grammar  G,  determine  whether  or  not  F  is  in  the  language  of  G 
-  is  NP-complete.  (Appendix  A  gives  one  proof  of  the  NP-completeness  of  this  problem.) 
The  flow  graph  recognition  problem  is  simpler  than  the  flow  graph  parsing  problem  for  flow 
graphs,  so  it  is  unlikely  that  there  is  a  flow  graph  parsing  algorithm  that  is  not  exponential 
in  the  worst  case. 

Nevertheless, we apply our flow graph parsing algorithm to the problem of partial
recognition of programs and do not encounter the exponential behavior in practice. The reason
is that we take advantage of constraints specific to the program domain which are strong
enough to reduce the complexity and prevent the worst case from happening. (The
application of the parser to other problem domains requires similar use of strong constraints.)

Efficiency is also gained by using a graph grammar that captures much of the
commonality among the flow graphs the parser is searching for. This enables the parser to reuse
results of exploring parts of the search space.

This chapter gives an expression for the time requirements of the parser, showing that
they depend on the number of full and partial analyses the parser generates. It points out
how  the  algorithm  can  be  made  to  exhibit  exponential  behavior  in  the  worst  case.  It  then 
explains  how  constraints  make  it  feasible  for  us  to  apply  this  inherently  exponential  process 
to  practical  program  recognition.  Weak  constraints  can  arise  in  the  general  flow  graph 
parsing  case  in  the  form  of  ambiguity  and  disconnected  right-hand  sides  of  graph  grammar 
rules.  However,  additional  program  domain-specific  constraints  compensate  for  these  weak 
structural  constraints. 

Empirical evidence supports these arguments and shows the effectiveness of the
constraints used. The empirical results were obtained by experimenting with the recognition of
the two example simulator programs, referred to as CST and PISIM. (These programs have
been modified from their original form (see Section 5.2.5) to get around the limitations of
the current system that are discussed in Sections 5.2 and 7.2. Even with these
modifications, the programs provide a realistic base for experimentation in that the modifications
did not significantly affect the strength of constraints.) Further experimentation on more
programs  is  needed  to  broaden  our  understanding  of  which  constraints  are  crucial  and  which 
programs  are  inherently  difficult  to  understand. 

This  chapter  concludes  with  a  few  suggestions  for  improving  the  performance  of  the 
parser. 


6.1  Cost 

This section presents an expression for the time requirements of the parsing and constraint
checking process which is at the heart of the recognition system. We first briefly describe
the particular instantiation of the general chart parsing algorithm that is used by the
recognition system. The instantiation fixes the rule invocation strategy to be bottom-up.
(This is the strategy used by the current recognition system for reasons described in Section
3.5. The top-down version of the algorithm for grammars with a simple embedding relation,
which encodes no aggregation relationships, is equivalent to Brotsky’s graph parsing
algorithm. See [15] for an analysis. For the top-down string parsing case, see Earley’s analysis
[31, 32].)

We  derive  a  formula  for  the  average-case  complexity  of  the  bottom-up  algorithm.  The 
cost  depends  on  the  number  of  items  that  are  created  by  the  parser.  Section  6.2  characterizes 
this  number  and  shows  how  the  worst-case  exponential  growth  in  the  number  of  items  is 
prevented  by  domain-specific  constraints  in  practice. 

In the complexity expression, the numbers of various types of items created by the parser
are weighted by the costs of the parser’s actions. Section 6.3 gives details of what the costs
of these actions depend upon.

6.1.1  Brief  Algorithm  Description 

For  the  purposes  of  our  analysis,  we  need  to  describe  a  few  additional  details  about  the 
structure  of  items  and  graph  grammars,  so  that  we  can  refer  to  them. 

Each rule in the grammar has an associated node ordering. This is a reflexive,
antisymmetric relation that need not be transitive. We denote it as $<_n$. We distinguish node
orderings in which all nodes are related in a chain as strict node orderings. In these, there
is exactly one minimal node $n_1$ (i.e., no other node is $<_n n_1$) and exactly one maximal
node $n_k$ (i.e., $n_k$ is not $<_n$ any other node), all of the nodes are ordered from $n_1$ to $n_k$ in a
sequence $(n_1, \ldots, n_k)$ such that $n_i <_n n_{i+1}$ for $i = 1, \ldots, k-1$, and no other pair of nodes is
related besides these. (The transitive closure of a strict node ordering is a total ordering.)
We call non-strict node orderings partial node orderings. The transitive closure of a partial
node ordering is a partial ordering.

We  call  the  node  type  that  an  item  is  recognizing  its  label.  Each  partial  item  has  a 
grammar  rule  associated  with  it  which  is  being  used  to  recognize  this  node  type.  Also,  each 
partial item contains a set of needed nodes, which are nodes not yet matched in the item
rule’s  right-hand  side.  We  distinguish  a  subset  of  these  as  immediately  needed.  This  subset 
is  determined  by  the  rule’s  node  ordering.  Initially,  the  immediately  needed  nodes  are  the 
minimal  nodes.  When  a  node  x  is  matched,  it  is  replaced  in  the  immediately  needed  set 
by  all  other  nodes  not  yet  matched  that  x  is  less  than  in  the  ordering.  (If  a  partial  item’s 
rule has a strict node ordering, the item will always have exactly one immediately needed
node.) 
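
The bookkeeping for the immediately needed set can be sketched as follows. This is a minimal illustration and not the recognition system's own code: the function names and the representation of a node ordering as a set of pairs (with reflexive pairs omitted) are assumptions made for the example.

```python
def minimal_nodes(nodes, less_than):
    """Nodes that no *other* node precedes in the ordering."""
    return {n for n in nodes
            if not any((m, n) in less_than for m in nodes if m != n)}


def match_node(x, needed, matched, less_than):
    """Match node x and return the updated (needed, matched) sets."""
    if x not in needed:
        raise ValueError(f"{x!r} is not immediately needed")
    matched = matched | {x}
    # x is replaced by every not-yet-matched node that x is less than.
    successors = {b for (a, b) in less_than if a == x and b not in matched}
    return (needed - {x}) | successors, matched


# Hypothetical strict ordering n1 < n2 < n3.
nodes = ["n1", "n2", "n3"]
chain = {("n1", "n2"), ("n2", "n3")}
needed, matched = minimal_nodes(nodes, chain), frozenset()
trace = []
for x in nodes:
    trace.append(set(needed))          # what is immediately needed right now
    needed, matched = match_node(x, needed, matched, chain)
```

With the chain ordering, every entry of `trace` holds exactly one node, which is the property the parenthetical remark above states for strict node orderings. With an empty ordering, `minimal_nodes` returns every node, so any of them may be matched first.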

The  immediately  needed  set  determines  which  nodes  are  allowed  to  be  matched  next. 
If  a  complete  item  for  node-type  A  is  added  to  the  chart,  only  partial  items  that  have 
immediately  needed  nodes  of  type  A  can  be  extended  by  the  complete  item.  Similarly,  if  a 
partial  item  is  added  to  the  chart,  it  is  only  combined  with  complete  items  for  those  nodes 
in  its  immediately  needed  set. 

Each item has a set of input and output mappings which specify the location of the
node-type being recognized. For partial items, these might be empty. The location is specified in
the  form  of  a  set  of  mappings  of  ports  on  a  node  (whose  type  is  the  item’s  label)  to  sets  of 
location  pointers  (which  may  be  nested  due  to  aggregation,  as  described  in  Section  3.4.1). 
Each  location  pointer  specifies  some  input  graph  edge. 

We  are  now  ready  to  describe  the  chart  parsing  algorithm  which  uses  a  bottom-up  rule 
invocation  strategy. 

1.  Initialization: 

•  Add  complete  items  to  the  agenda  for  each  input  graph  node.  The  label  of  each 
item  is  the  node  label  of  the  input  graph  node  it  represents. 

•  For  each  rule,  add  an  empty  partial  item  to  the  agenda.  The  label  of  the  item  is 
the  node-type  of  the  rule’s  left-hand  side.  Make  the  item  immediately  need  the 
set of nodes that are minimal in the rule’s right-hand side node ordering.¹

2.  Until  the  agenda  is  empty,  continually  pull  an  item  X  from  the  agenda  and  if  X  is  not 
a  member  of  the  chart,  do  the  following: 

•  Add  X  to  the  chart. 

•  If  X  is  a  complete  item  and  X’s  constraints  are  satisfied,  then  for  each  partial 
item  P  in  the  chart  that  is  extendable  by  X,  make  a  new  item  extending  P  with 
X  and  put  it  on  the  agenda. 

¹ One or the other, but not both, of these initialization steps can add the items to the chart as an
optimization. Also, the empty partial items can be added to the agenda as they are needed, as described in
Section 3.5. To simplify the analysis, neither optimization is done here.

• If X is a partial item, then for each complete item C in the chart that can extend
X,  make  a  new  item  extending  X  with  C  and  put  it  on  the  agenda. 

•  Apply  the  tests  and  operations  of  the  additional  monitors  to  the  item.  For 
example,  for  each  complete  item  X  whose  constraints  are  satisfied,  the  zip-up 
monitor  determines  whether  there  are  items  that  can  zip  up  with  X.  If  so,  it 
performs  the  zip-ups  and  adds  the  results  to  the  agenda. 

To  clarify,  the  check  that  “X  is  not  a  member  of  the  chart”  is  checking  that  there  is  no 
item  in  the  chart  that  represents  the  same  analysis  as  X.  If  X  is  partial,  then  this  checks 
that  there  is  no  other  partial  item  that  matches  the  same  right-hand  side  nodes  of  some  rule 
to  the  same  input  graph  terminal  nodes  or  non-terminal  instances.  If  X  is  complete,  then 
this  checks  that  there  is  no  other  complete  item  with  the  same  label  at  the  same  location 
as  X. 
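
The agenda and chart loop described above can be sketched in a few dozen lines. The sketch below is a deliberately reduced model and not the system's implementation: it assumes strict node orderings, replaces ports and location pointers by the set of input graph nodes an item covers, treats an "edge connection constraint" as the existence of some input graph edge between the two matches, and omits attribute conditions and the zip-up monitor. All class and function names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    lhs: str
    rhs: tuple          # node types, matched in this (strict) order
    edges: frozenset    # {(i, j)}: the match of rhs[i] must feed the match of rhs[j]


@dataclass(frozen=True)
class Complete:
    label: str
    covers: frozenset   # input graph nodes spanned; stands in for the location


@dataclass(frozen=True)
class Partial:
    rule: Rule
    matched: tuple      # Complete items matched so far, in rhs order


def connected(a, b, graph_edges):
    return any((u, v) in graph_edges for u in a.covers for v in b.covers)


def can_extend(p, c, graph_edges):
    i = len(p.matched)                      # index of the one needed node
    if i >= len(p.rule.rhs) or p.rule.rhs[i] != c.label:
        return False                        # node type constraint
    if any(c.covers & m.covers for m in p.matched):
        return False                        # an input node is matched only once
    slots = p.matched + (c,)
    return all(connected(slots[a], slots[b], graph_edges)
               for (a, b) in p.rule.edges if max(a, b) == i)


def extend(p, c):
    matched = p.matched + (c,)
    if len(matched) < len(p.rule.rhs):
        return Partial(p.rule, matched)
    covers = frozenset().union(*(m.covers for m in matched))
    return Complete(p.rule.lhs, covers)


def parse(node_types, graph_edges, rules):
    # Initialization: one complete item per input node, one empty item per rule.
    agenda = deque(Complete(t, frozenset([n])) for n, t in node_types.items())
    agenda.extend(Partial(r, ()) for r in rules)
    chart = set()
    while agenda:
        x = agenda.popleft()
        if x in chart:                      # duplicate test
            continue
        chart.add(x)
        for y in list(chart):
            if isinstance(x, Complete) and isinstance(y, Partial):
                if can_extend(y, x, graph_edges):
                    agenda.append(extend(y, x))
            elif isinstance(x, Partial) and isinstance(y, Complete):
                if can_extend(x, y, graph_edges):
                    agenda.append(extend(x, y))
    return chart


# Hypothetical input: nodes 1 and 3 of type a, node 2 of type b, one edge 1 -> 2.
node_types = {1: "a", 2: "b", 3: "a"}
graph_edges = {(1, 2)}
rules = [Rule("S", ("a", "b"), frozenset({(0, 1)}))]
chart = parse(node_types, graph_edges, rules)
```

In this toy run the partial item that matched node 3 is a dead end, because no edge connects node 3 to node 2. It still occupies a chart entry, which is why the cost analysis counts every item created and not only those that reach completion.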

There  are  two  situations  in  which  an  item  can  be  created  that  is  a  duplicate  of  an 
existing  item.  One  occurs  when  there  is  structural  ambiguity  (i.e.,  there  is  more  than  one 
way  to  derive  the  same  flow  graph  from  the  same  non-terminal). 

The  other  situation  occurs  when  two  complete  or  partial  items  are  created  as  a  result 
of  a  series  of  extensions,  starting  from  the  same  partial  item  and  involving  the  same  set  of 
complete  items  for  the  same  right-hand  side  nodes,  but  occurring  in  two  different  orders. 

Figure 6-1 gives an example. The partial item $I_p$ immediately needs two nodes, $n_1$ of
type A and $n_2$ of type B. Two complete items are formed, one for A and the other for
B, such that both can extend $I_p$. $I_p$ is extended to two new items $I_{p1}$ and $I_{p2}$. Since the
complete items for A and B are compatible in that they satisfy the binary constraints that
$I_p$'s rule imposes on $n_1$ and $n_2$, $I_{p1}$ and $I_{p2}$ are extended with the complete item for B
and A, respectively. The two resulting items are duplicates of each other, since they have
the same right-hand side nodes ($n_1$ and $n_2$) matched to the same non-terminal instances
(represented by the complete items for A and B).

This  can  only  happen  if  a  partial  item  is  able  to  have  more  than  one  immediately  needed 
right-hand  side  node.  Therefore,  it  occurs  only  when  a  rule  has  a  partial  node  ordering. 

Each  complete  and  partial  analysis  created  by  the  parser  is  added  to  the  chart  exactly 
once.  This  is  guaranteed  because  before  adding  an  item  to  the  chart,  the  parser  explicitly 
checks  for  a  duplicate  item  already  existing  in  the  chart. 
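
The order-of-extension case can be illustrated with a small sketch. It assumes, purely for illustration, that an analysis is identified by the set of (right-hand side node, complete item) pairs it contains; the item names and locations are hypothetical.

```python
def extend(analysis, rhs_node, complete_item):
    """Record that rhs_node is matched to complete_item."""
    return analysis | {(rhs_node, complete_item)}


empty = frozenset()
a_item, b_item = ("A", "loc-1"), ("B", "loc-2")   # hypothetical complete items

# The same two matches, made in the two possible orders.
a_then_b = extend(extend(empty, "n1", a_item), "n2", b_item)
b_then_a = extend(extend(empty, "n2", b_item), "n1", a_item)

chart, created, discarded = set(), 0, 0
for item in (a_then_b, b_then_a):
    created += 1
    if item in chart:
        discarded += 1      # duplicate: same nodes matched to the same items
    else:
        chart.add(item)
```

Both items are created, so both incur creation and agenda costs, but only one enters the chart. Keying the duplicate test on what is matched, and not on the order of matching, is what makes the second item detectable as a duplicate.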

A  grammar  that  is  structurally  ambiguous  provides  multiple  ways  to  hierarchically  view 
a  subgraph.  The  multiple  derivations  are  sometimes  useful  for  understanding  purposes. 
So,  rather  than  simply  throwing  away  duplicate  complete  items  that  represent  different 
derivations,  we  can  store  them  in  an  auxiliary  structure  to  be  accessed  when  presenting  the 
parser’s  results. 

Figure 6-1: Two series of extensions resulting in duplicate items.

Another clarification of the algorithm concerns the timing of constraint checking.
Grammar rules place a number of constraints on the nodes and edges that match their right-hand
sides. Some of these constraints are checked in the extendibility criterion (e.g., node type
and edge connection constraints). Others (e.g., most attribute conditions) are checked when
a complete item is added to the chart, before it is paired up with partial items to extend.
Section 6.2.2 discusses the design decision concerning which constraints should be checked
in the extendibility criterion and which should be postponed to apply to complete items
alone.

Additional  details  of  this  algorithm  will  be  fleshed  out  as  needed.  In  particular,  many  of 
the  details  that  are  relevant  to  the  actions  of  the  parser,  such  as  adding  items  to  or  looking 
up  items  in  the  chart,  have  not  been  presented.  These  will  be  described  when  the  cost  of 
each  of  these  actions  is  considered. 

6.1.2  Complexity 

We  can  determine  the  cost  of  the  parsing  algorithm  by  considering  the  cost  of  each  of  its 
sub-operations  and  how  often  they  are  performed  (i.e.,  the  total  number  of  items  they  act 
upon).  To  do  this,  it  is  useful  to  categorize  the  types  of  items  created.  We  partition  the 
full set of items ever created, denoted by $I_t$, in two ways. As shown in Figure 6-2a, one
partitioning views $I_t$ as consisting of four disjoint sets of items which are differentiated by
how the items in the sets were created. (The relative sizes of the sets in the figure are not
meant to reflect the relative sizes of the actual item sets.)

• $I_n$ is the set of complete items created during initialization for each of the terminal
nodes of the input graph.

• $I_r$ is the set of empty partial items created during initialization for each rule.

• $I_z$ is the set of items created by zipping up two or more items.

• $I_e$ contains all items created by extension.

The second partitioning breaks up $I_t$ into two disjoint sets, as shown in Figure 6-2b:

• $I_d$ is the subset of $I_e$ that contains duplicate items that were created but not added
to the chart, and

• $I_c$ is the set of items that are in the chart.

Figure 6-2c shows how the sets overlap across partitionings. We denote as $I_f$ the subset
of items in the chart which are complete items. $I_f$ is shown in Figure 6-2c as the shaded
portion.

We  can  now  characterize  the  overall  cost  of  the  parsing  algorithm  by  considering  the 
number  of  times  each  of  the  actions  of  the  parser  is  applied.  This  can  be  expressed  in  terms 
of  the  sizes  of  the  various  sets  of  items  described  above.  This  is  because  each  action  of  the 
parser  acts  upon  a  particular  type  of  item  and  it  is  applied  exactly  once  for  each  item  of 
that  type.  There  are  no  additional  costs  not  accounted  for.  The  overall  cost  is  a  sum  of  the 
action  costs  weighted  by  the  number  of  items  to  which  they  apply. 
Figure 6-2: Partitions of the total item set. (a) Partitioning based on how items are created. (b) The second partitioning. (c) Relationship between the partitions.

We consider which actions are applied to each of the items in each type of item set.
Each  action  is  followed  by  a  variable  denoting  the  run-time  cost  of  performing  this  action 
on  an  item.  These  variables  are  used  below  in  expressing  the  algorithm’s  complexity. 

The following actions are taken upon each item ever created, whether or not it is added
to the chart (i.e., for all $I \in I_t$).

• create it, which is one of these actions

- if $I \in I_n$, create complete item for a terminal node ($C_{\text{instantiate-terminal}}$)

- if $I \in I_r$, instantiate empty partial item ($C_{\text{instantiate-empty}}$)

- if $I \in I_e$, create item by extension ($C_{\text{extend}}$)

- if $I \in I_z$, create item by zipping up other items ($C_{\text{zip-up}}$)

• add it to the agenda ($C_{\text{agenda-add}}$)

• pull it from the agenda ($C_{\text{agenda-retrieve}}$)

• look for a duplicate of it ($C_{\text{duplicate-test}}$).

Each item added to the chart (i.e., each item in $I_c$) additionally has the following actions
applied to it. (For now, assume the only additional monitor is the zip-up monitor.)

• add it to the chart ($C_{\text{chart-add}}$),

• look up items to combine with it ($C_{\text{combination-lookup}}$),

• look up items to zip up with it ($C_{\text{zip-up-lookup}}$).

Each complete item in the chart (i.e., those in $I_f$) has its constraints checked ($C_{\text{constraint-check}}$).

The total run-time cost of this algorithm, in terms of the component action costs and
the sizes of the item sets, is:


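$$
\begin{aligned}
& |I_t| \cdot (C_{\text{agenda-add}} + C_{\text{agenda-retrieve}} + C_{\text{duplicate-test}}) \; + \\
& |I_e| \cdot C_{\text{extend}} \; + \\
& |I_c| \cdot (C_{\text{chart-add}} + C_{\text{combination-lookup}}) \; + \\
& |I_r| \cdot C_{\text{instantiate-empty}} \; + \\
& |I_n| \cdot C_{\text{instantiate-terminal}} \; + \\
& |I_z| \cdot C_{\text{zip-up}} \; + \\
& |I_f| \cdot (C_{\text{constraint-check}} + C_{\text{zip-up-lookup}})
\end{aligned}
$$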


The sizes of the component action costs are typically quite small. They depend
polynomially upon the sizes of various parts of an item, such as the number of inputs or outputs.
These  costs  are  detailed  in  Section  6.3,  where  empirical  averages  are  also  presented. 
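
The weighted sum is easy to evaluate once the item counts are known. The sketch below is only an illustration: the dictionary keys are hypothetical names for the item sets and action costs, and the numbers are invented placeholders, not measurements from CST or PISIM.

```python
def total_cost(sizes, costs):
    """Sum of action costs weighted by the number of items each applies to."""
    return (
        sizes["t"] * (costs["agenda_add"] + costs["agenda_retrieve"]
                      + costs["duplicate_test"])
        + sizes["e"] * costs["extend"]
        + sizes["c"] * (costs["chart_add"] + costs["combination_lookup"])
        + sizes["r"] * costs["instantiate_empty"]
        + sizes["n"] * costs["instantiate_terminal"]
        + sizes["z"] * costs["zip_up"]
        + sizes["f"] * (costs["constraint_check"] + costs["zip_up_lookup"])
    )


# Invented counts: t = n + r + z + e, and t = c + (duplicates not in the chart).
sizes = {"t": 10, "e": 6, "c": 8, "r": 2, "n": 2, "z": 0, "f": 3}
unit = {name: 1 for name in (
    "agenda_add", "agenda_retrieve", "duplicate_test", "extend", "chart_add",
    "combination_lookup", "instantiate_empty", "instantiate_terminal",
    "zip_up", "constraint_check", "zip_up_lookup")}
cost = total_cost(sizes, unit)
```

With unit action costs the total reduces to a weighted count of items, which makes it easy to see that the terms multiplied by the total, extension, and chart item counts contribute most when extension accounts for most of the items.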
In a typical recognition run, the dominant terms in the complexity formula are the first
three. $I_e$ is typically the largest of the item sets in the first partitioning. $I_c$ is the largest in
the second partitioning. It usually consists mostly of items that were created by extension
as opposed to instantiation or zip-up (i.e., a majority of $I_c$ overlaps with $I_e$).

The run-time space requirements of the parser also depend on the number of items
created by the parser. The space cost is $O(|I_t|)$.

6.2  Counting  Items 

The  algorithm’s  complexity  (both  time  and  space)  depends  on  how  much  is  recognized. 
This  is  a  feature  of  the  algorithm  and  is  a  consequence  of  the  bottom-up  rule  invocation 
strategy  used  by  the  parser.  The  amount  recognized  can  be  measured  by  the  number  of 
items  the  parser  creates,  since  each  represents  a  partial  or  complete  recognition  of  some 
sub-flow  graph. 

This  section  focuses  primarily  on  characterizing  the  number  of  items  that  are  created 
by  the  parser  through  extension.  In  practice,  more  items  are  created  by  extension  than 
by instantiation or zip-up. The size of this set dominates the space cost, and the run-time cost of
operations over this set dominates the parser’s time complexity.

To simplify the presentation, we temporarily assume that no items are created by
zipping up items. In this way, we avoid cluttering the discussion with details about zip-ups
which might be irrelevant to applications of the graph parser other than program
recognition, which do not require parsing structure-sharing graph grammars. In Section 6.2.6,
we consider the effect of zip-ups on the total item count.

We also simplify the discussion by assuming for now that the nodes of each rule’s
right-hand side are matched according to a strict node ordering. One effect of enforcing a strict
node  ordering  is  that  the  parser  does  not  generate  duplicate  items  representing  the  same 
analysis.  That  is,  each  item  created  by  extension  is  unique  in  that  there  is  no  other  item 
for  the  same  rule  R  which  has  the  same  matches  for  each  of  R's  right-hand  side  nodes. 

To see this, suppose an item $I_1$ were created for which there is a duplicate item $I_2$.
The  two  items  would  have  to  be  created  through  a  series  of  extensions  involving  the  same 
complete  items  for  the  same  right-hand  side  nodes,  but  the  extensions  would  have  to  occur 
in  different  orders.  This  is  because  each  partial  and  complete  item  is  added  to  the  chart  at 
most  once  and  they  are  combined  with  each  other  only  once  -  when  the  second  of  the  two 
is  added  to  the  chart.  So,  the  same  partial  item  cannot  be  extended  more  than  once  by  the 
same  complete  item  for  the  same  node.  Since  the  series  of  extensions  must  have  occurred 
in  different  orders,  some  partial  item  must  have  been  extended  with  complete  items  for 
more  than  one  right-hand  side  node.  This  can  only  happen  to  a  partial  item  that  has  more 
than  one  immediately  needed  node,  which  can  only  occur  when  partial  node  orderings  are 
being  used.  Therefore,  with  strict  node  orderings,  no  duplicate  items  representing  the  same 
analysis will be created.
Another effect of using a strict node ordering is that fewer partial items are created.
By the argument just given, strict node orderings permit only one possible series of partial
items leading to a complete item through extension. Partial node orderings may allow
several series of extensions, each involving a different set of partial items.
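
The difference can be made concrete by enumerating, for one fixed choice of complete items, every order in which a rule's right-hand side nodes may be matched. The sketch below is an illustration under stated assumptions: the ordering is a set of pairs with reflexive pairs omitted, a partial item is identified by the set of nodes matched so far, and the function name is hypothetical.

```python
def explore(nodes, less_than):
    """Return (number of extension series, number of distinct partial items)."""
    minimal = frozenset(n for n in nodes
                        if not any((m, n) in less_than for m in nodes if m != n))
    sequences, partial_items = 0, set()
    stack = [(minimal, frozenset())]        # (immediately needed, matched)
    while stack:
        needed, matched = stack.pop()
        if len(matched) == len(nodes):
            sequences += 1                  # reached the complete item
            continue
        partial_items.add(matched)
        for x in needed:
            new_matched = matched | {x}
            succ = {b for (a, b) in less_than
                    if a == x and b not in new_matched}
            stack.append(((needed - {x}) | succ, new_matched))
    return sequences, len(partial_items)


nodes = ["n1", "n2", "n3"]
strict = explore(nodes, {("n1", "n2"), ("n2", "n3")})
unordered = explore(nodes, set())
```

For three nodes, the chain yields a single series passing through three partial items, while the empty ordering (every node minimal) yields six series over seven distinct partial items. The empty ordering is the extreme case; orderings between the two extremes fall in between.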

The  reason  we  consider  the  case  of  using  strict  node  orderings  first  is  that  this  makes 
it  easier  to  see  the  effect  of  constraints  on  reducing  the  parser’s  search.  We  want  to  study 
the  growth  in  the  number  of  items  for  a  particular  rule  as  the  size  of  the  items  increases. 
This  growth  is  affected  by  two  things:  the  constraints  that  are  acting  on  the  right-hand 
side  nodes  matched  so  far  and  the  number  of  immediately  needed  nodes  an  item  can  have. 
Strict  node  orderings  force  the  number  of  immediately  needed  nodes  of  any  partial  item  to 
be  exactly  one.  So,  imposing  a  strict  node  ordering  on  all  rules  allows  us  to  study  the  effect 
of  constraints  on  the  growth  of  the  number  of  items,  independent  of  the  effect  of  multiple 
immediately  needed  nodes. 

Another  reason  we  make  this  simplification  is  that  parsing  using  a  strict  node  ordering 
is  one  of  the  ways  in  which  this  parser  is  expected  to  be  used.  It  is  more  efficient  than 
parsing  with  partial  node  orderings  since,  in  general,  it  allows  fewer  partial  items  to  be 
created.  (String  chart  parsing  is  a  general  case  in  which  strict  node  ordering  is  typically 
used,  where  the  “nodes”  are  string  symbols.) 

The  analysis  of  the  algorithm  when  partial  node  orderings  are  being  used  is  an  extension 
of  the  analysis  of  this  simplified  form.  This  is  given  in  Section  6.2.7,  where  the  advantages 
of  using  strict  versus  partial  node  orderings  are  also  discussed. 

The  organization  of  this  section  is  centered  around  the  characterization  of  the  number 
of  items  generated  for  a  single  rule  through  extension.  The  total  number  of  items  created  by 
extension  is  the  sum  of  this  number  over  all  the  rules  of  the  grammar.  Section  6.2.1  defines 
item  trees,  which  relate  the  items  created  by  the  parser  in  matching  a  rule’s  right-hand  side. 
Sections  6.2.2  and  6.2.3  discuss  the  effect  that  constraints  and  the  grammar  have  on  the 
growth  of  these  trees.  Empirical  observations  of  the  shape  of  item  trees  (i.e.,  the  growth  of 
the  number  of  items)  created  in  two  typical  recognition  runs  are  given  in  Section  6.2.4.  In 
Section 6.2.5, we borrow a theoretical model presented by Grimson [49, 50] in his analysis
of the constrained search object recognition technique, which is similar to the sub-flow
graph matching subprocess performed by our parser. The model helps us to understand
the role of constraints and suggests future research into ways of concretely measuring their
effectiveness for a particular input flow graph and grammar. The final two sections (6.2.6
and 6.2.7) lift the two simplifying assumptions of suppressing zip-ups and using only strict
node  orderings  and  discuss  the  effects  this  has  on  the  parser’s  complexity. 

6.2.1  Item  Trees 

For  each  rule,  the  parser  searches  for  a  match  of  the  rule’s  right-hand  side  nodes,  such  that 
the  rule’s  constraints  hold.  Each  right-hand  side  node  is  matched  to  some  terminal  node  or 
some non-terminal instance that has been found in the input graph. The rule's constraints
are  unary  (such  as  node  type  constraints)  or  binary  (such  as  edge  connection  constraints). 
The  items  for  a  rule  R  represent  each  of  the  stages  in  this  search.  The  size  of  an  item  is 
the  number  of  right-hand  side  nodes  of  the  item’s  rule  it  has  matched  so  far.  The  number 
of  items  created  is  an  indication  of  the  amount  of  search  the  parser  is  doing. 

The  items  for  a  rule  R  can  be  viewed  as  vertices  of  an  item  tree.  The  root  of  the  tree  is 
the  empty  item  for  R.  An  item  is  the  child  of  another  item  (called  the  parent)  iff  the  parent 
was  extended  to  the  child  during  parsing. 

A parent item can be extended to two or more child items if more than one instance of
some  right-hand  side  node  type  is  found  in  the  input  graph  and  these  instances  satisfy  the 
constraints  imposed  by  the  item’s  rule  with  respect  to  the  matches  of  other  nodes  that  have 
been  made  so  far.  (With  partial  node  orderings,  additional  children  are  generated  if  an  item 
has  more  than  one  immediately  needed  node,  as  is  discussed  in  Section  6.2.7.) 

The  growth  in  the  number  of  items  that  are  created  by  extension  can  be  modeled  by 
these  item  trees.  In  the  worst  case,  the  number  of  items  at  the  fringe  of  an  item  tree  for 
a given rule R can be exponential in the number of nodes in R's right-hand side, k. In
particular, if each node in the right-hand side can be matched to m instances of its node
type, then the number of possible complete items (of size k) is $m^k$ and the total number of
items created in recognizing R's right-hand side is $\sum_{i=0}^{k} m^i$.
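
As a quick numeric check of this worst case, the sketch below counts the vertices of an unpruned item tree with uniform branching factor m and depth k. The function names are hypothetical, and the uniform branching factor is the worst-case assumption from the text, not an observed value.

```python
def fringe_size(m, k):
    """Complete items of size k when every node has m candidate matches."""
    return m ** k


def item_tree_size(m, k):
    """All items in the tree: one level per number of matched nodes, 0..k."""
    return sum(m ** i for i in range(k + 1))


small = item_tree_size(3, 4)     # 1 + 3 + 9 + 27 + 81
```

For m greater than 1 the sum equals (m^(k+1) - 1) / (m - 1), so the fringe accounts for most of the tree and the total grows at the same exponential rate as the number of complete items.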

Furthermore,  in  general,  m  can  be  much  worse  than  linear  in  the  number  of  nodes  of 
the  input  graph  because  of  the  recursive  nature  of  the  matching  process  in  parsing.  Each 
of the complete items at the fringe of an item tree for a rule R represents an instance of R's
left-hand  side  node  type.  Since  there  can  be  an  exponential  number  of  them,  m  can  be 
exponential.  In  the  worst  case,  this  exponential  can  build  up  as  higher-level  non-terminals 
are  recognized.  (Assuming  the  grammar  contains  no  cycles,  we  define  the  height  of  a  node 
type  recursively  as:  the  height  of  a  terminal  type  is  0  and  the  height  of  non-terminal  type 
A  is  one  plus  the  maximum  of  the  heights  of  all  node  types  on  the  right-hand  sides  of  the 
rules  for  A.) 

As the worst case, suppose the following. All rules have right-hand sides of size k. Each
non-terminal has only one rule for it. Each right-hand side has either only terminals or only
non-terminals. Each terminal node can match n input graph nodes. Each non-terminal
in the same right-hand side is at the same height in the grammar. Then, the number of
complete items for a non-terminal at height h is $n^{k^h}$: a non-terminal at height 1 has $n^k$ complete items, and each additional level of height raises the count to the $k$-th power.
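
Under the worst-case assumptions just listed, the count can be computed level by level: a terminal type has n instances, and a non-terminal whose k right-hand side nodes each have c candidate matches has c^k complete items. The sketch below applies that recurrence; the function name is hypothetical and the recurrence is derived from the stated assumptions.

```python
def complete_items_at_height(n, k, h):
    """Worst-case number of complete items for a node type at height h."""
    count = n                    # height 0: a terminal type with n instances
    for _ in range(h):
        count = count ** k       # k right-hand side nodes, `count` matches each
    return count


growth = [complete_items_at_height(2, 2, h) for h in range(4)]
```

Even with n = 2 and k = 2 the count squares at every level, which is the doubly exponential build-up that the constraints discussed next are needed to prevent.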

6.2.2  Constraints  Prune  Item  Trees

It  would  be  crazy  to  use  this  inherently  exponential  algorithm  for  program  recognition 
if  it  were  not  that,  in  practice,  constraints  prune  item  trees  considerably.  For  example, 
node  type  constraints  alone  are  able  to  reduce  the  branching  factor,  which  is  the  base  of  the 
exponential. In the program examples, there is a variety of terminal and non-terminal
node-types, with a fairly flat distribution of instances. In CST, the average number of instances
of  each  node  type  is  3.6,  with  a  median  of  2.  In  PISIM,  the  average  is  3.7,  with  median  2. 

The  exponential  build-up  of  the  number  of  instances  of  non-terminals  as  their  height 
increases  is  not  typically  encountered,  either.  The  number  of  instances  of  non-terminals  is 
usually  small  and  decreases  as  their  height  in  the  grammar  increases.  The  reason  is  that 
the  recognition  of  high-level  non-terminals  requires  more  constraints  to  be  satisfied  than  for 
low-level  non-terminals. 

The  worst-case  exponential  behavior  of  the  parser  is  only  encountered  if  the  constraints 
imposed  by  the  grammar  rules  are  weak.  This  section  explores  the  constraints  used  in 
applying  the  graph  parser  to  program  recognition  and  describes  their  effect  on  the  growth 
of  item  trees  in  terms  of  empirical  observations. 

A  complete  item  for  a  non-terminal  A  is  one  in  which  for  some  rule  for  A,  all  the  rule’s 
right-hand  side  nodes  are  matched  to  input  graph  nodes  or  non-terminal  instances,  such 
that  the  rule’s  unary  and  binary  constraints  are  satisfied.  The  unary  constraints  are  the 
node-type  constraints  that  each  node  in  the  right-hand  side  imposes  on  the  nodes  matched 
with  it.  The  binary  constraints  are  the  following: 

•  Edge  connection  constraints  between  pairs  of  ports  on  nodes.  (These  include  the 
constraints  on  aggregation  organization  discussed  in  Section  3.5.2.) 

•  Attribute  conditions,  which  are  binary  relations  on  the  attributes  of  nodes  and  edges. 

•  Port precedence restrictions, which are constraints on the edges in an input graph that
can be mapped to the ports of a non-terminal. In particular, a transitive, irreflexive,
and antisymmetric relation precedes imposes an ordering on the ports in the input
graph. The source of each edge precedes the sink of the edge, and the input ports of
each node precede each of the node's output ports. The port precedence constraint
is that no two input (or output) ports on a non-terminal can be mapped to a pair of
input graph edges in which the sink of one precedes the source of the other.
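
The port precedence restriction can be read as a reachability test over ports. The following is a minimal sketch of that reading, not the recognition system's implementation: the port names, the encoding of edges as (source, sink) pairs, and the helper names `build_precedence`, `precedes`, and `violates_port_precedence` are all assumptions made for illustration.

```python
from collections import defaultdict

def build_precedence(edges, nodes):
    """Record the direct 'precedes' pairs between ports.

    edges: iterable of (source_port, sink_port) pairs.
    nodes: iterable of (input_ports, output_ports) pairs, one per node.
    """
    succ = defaultdict(set)
    for source, sink in edges:
        succ[source].add(sink)          # an edge's source precedes its sink
    for inputs, outputs in nodes:
        for port in inputs:
            succ[port].update(outputs)  # a node's inputs precede its outputs
    return succ

def precedes(succ, p, q):
    """True if port p precedes port q under the transitive closure."""
    stack, seen = list(succ.get(p, ())), set()
    while stack:
        port = stack.pop()
        if port == q:
            return True
        if port not in seen:
            seen.add(port)
            stack.extend(succ.get(port, ()))
    return False

def violates_port_precedence(succ, mapped_edges):
    """mapped_edges: input graph edges mapped to a non-terminal's input (or output) ports."""
    return any(
        precedes(succ, first[1], second[0])   # sink of one precedes source of the other
        for first in mapped_edges
        for second in mapped_edges
        if first != second
    )

# Hypothetical graph: edge 12 enters node b, b feeds node a, edge 15 leaves a.
edge_12 = ("x.out", "b.in")
edge_15 = ("a.out", "y.in")
edges = [edge_12, ("b.out", "a.in"), edge_15]
nodes = [(["b.in"], ["b.out"]), (["a.in"], ["a.out"])]
succ = build_precedence(edges, nodes)
cyclic = violates_port_precedence(succ, [edge_12, edge_15])
```

Here `cyclic` is true because the sink of edge 12 reaches the source of edge 15 through b and a, so mapping both edges to input ports of one non-terminal would produce a cyclic reduction.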

The port precedence restrictions are used to avoid cyclic reductions, such as the one
shown in Figure 6-3. The non-terminal A's top input port is mapped to the input graph
edge with location pointer 12 coming into b, while A's bottom input port maps to the edge
with location pointer 15 coming from a. This is illegal, since b's input precedes a's output.
The reason cyclic reductions are prevented is that they are unnecessary:

•  flow  graphs  are  acyclic, 

•  all  sentential  forms  of  a  flow  graph  grammar  are  acyclic  (i.e.,  you  cannot  derive  a  flow 
graph  that  is  cyclic), 

•  a  reduction  step  that  creates  a  cyclic  graph  cannot  be  the  inverse  of  any  valid  deriva¬ 
tion  step,  so  the  cyclic  graph  will  not  be  reduced  further. 

Figure 6-3: Grammar and input graph leading to an illegal, cyclic reduction. (Panels: a) a simple grammar; b) an input graph; c) a cyclic reduction.)

Cyclic  reductions  do  not  cause  any  problems.  They  simply  result  in  dead-end  items  that 
are  not  used  by  anyone.  We  avoid  them  simply  because  they  waste  time  and  space.  This 
restriction  can  be  lifted  if  a  cyclic  reduction  is  a  useful  interpretation  to  report  and  the  flow 
graph  formalism  is  extended  to  include  cycles. 

Some  of  these  unary  and  binary  constraints  are  applied  incrementally  to  each  partial 
item  as  the  complete  match  is  being  built  up.  Since  these  are  interleaved  with  the  matching 
process,  we  refer  to  them  as  match-interleaved  constraints.  They  are  applied  as  soon  as  the 
portions  of  the  right-hand  side  to  which  they  refer  are  matched.  These  constraints  are  part 
of the extendibility criterion.

Other constraints are postponed until the match is complete (i.e., all nodes and edges
of  the  right-hand  side  are  paired  with  nodes  and  edges  of  the  input  graph).  These  are 
interleaved  with  the  parsing  process  and  are  referred  to  as  parse-interleaved  constraints. 

The  decision  about  whether  to  match-interleave  or  parse-interleave  a  particular  con¬ 
straint  depends  on  its  effectiveness  in  pruning  the  search,  the  cost  of  applying  it,  and 
its  degree  of  applicability.  Ideally,  the  match-interleaved  constraint  should  be  satisfied 
by  relatively  few  matches,  be  inexpensive  to  check,  and  apply  to  most  nodes  or  pairs  of 
nodes. The current recognition system match-interleaves node-type, edge connection,
co-occurrence, and port precedence constraints. All attribute conditions other than
co-occurrence constraints are parse-interleaved. This section discusses how this decision
was made and describes the impact that match-interleaving of these constraints has on the
complexity of matching right-hand sides in the two example simulator programs.

node-type                 number of instances
aref                      6
mod                       4
Increment-or-Decrement    12
Decrement                 3

Table 6.1: Number of instances of CIS-Extract's node types.

We  are  not  only  trying  to  show  the  advantages  of  match-interleaving  some  constraints 
versus  parse-interleaving  them.  (The  advantages  are  obvious.)  We  are  mainly  trying  to  show 
the  effect  that  various  constraints  have  on  the  complexity.  The  case  in  which  a  constraint  is 
parse-interleaved  is  simply  a  base-line  to  which  to  compare  the  case  in  which  the  constraint 
is  match-interleaved.  The  improvement  is  a  measure  of  the  effectiveness  of  that  constraint. 

For  most  rules,  node  type  and  edge  connection  constraints  are  strong.  The  strength  of 
a  node-type  constraint  depends  on  the  number  of  instances  of  that  node-type  in  the  input 
graph.  Since  the  distribution  of  node  types  is  fairly  flat  in  the  flow  graphs  representing 
the  two  example  programs,  the  node  type  constraint  can  usually  significantly  reduce  the 
number  of  possible  matchings  between  right-hand  side  nodes  and  node  type  instances  in 
the  input  graph. 

The  strength  of  an  edge  connection  constraint  depends  on  the  number  of  edges  in  the 
input  graph.  If  this  number  is  low,  then  few  pairs  of  incorrect  matches  between  nodes  will 
satisfy  the  constraint.  The  flow  graphs  representing  the  two  example  programs  had  sparse 
edge  sets.  The  average  degree  of  the  ports  in  CST  is  1.3,  with  a  median  of  1.  In  PISIM,  the 
average  degree  is  1.5,  with  a  median  of  1. 

However,  there  is  a  class  of  rules  for  which  node  type  and  edge  connection  constraints  are 
weak.  In  particular,  in  rules  representing  cliched  operations  on  aggregate  data  structures, 
the right-hand side graph is usually made up of disconnected nodes. The operations on
aggregate data structures tend to be implemented using a set of less abstract operations that
act on the parts of the structure independently. In addition, many of the aggregate
operations are implemented by primitive operations that are relatively common in the program
(e.g., +), as well as being common among the aggregate operations.

The  plan  for  Circular-Indexed  Sequence  Extract  is  an  example  (see  Figure  6-4).  The 
rule  encoding  a  plan  like  this  imposes  few  structural  constraints,  since  it  has  few  edges 
between  its  nodes.  It  also  contains  nodes  that  are  of  relatively  common  node  types.  Table 
6.1  shows  the  distribution  of  number  of  instances  over  these  node  types. 

Figure 6-4: The plan for extracting from a Circular-Indexed Sequence. (Diagram labels: Old: Circular-Indexed-Sequence; Update-First; Increment/Decrement; New: Circular-Indexed-Sequence; CIS-Extract.)

Figure 6-5: Bushy item tree produced in recognizing CIS-Extract with weak match-interleaved constraints.

If no other constraints are interleaved with the matching process, a combinatorial
explosion occurs in the number of items created in recognizing CIS-Extract. Figure 6-5 shows
the bushy item tree created for CIS-Extract in this case. The items of size 1 are those
created in extending the initial empty partial item with the complete items representing
three instances of Decrement. Each of these is then extended with the six complete items
for the AREF terminal nodes, yielding 18 items. Each of these is extended by the 12 complete
items for Inc-or-Dec, yielding 216 items. Finally, the parser extends these with each of the
four complete items for MOD for which the edge connection constraint is satisfied.
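
The growth described above is a running product of the instance counts in Table 6.1. The short sketch below reproduces that arithmetic; the function name `level_bounds` is an illustrative assumption, and the matching order (Decrement, aref, Increment-or-Decrement, mod) follows the description in the text.

```python
from itertools import accumulate
from operator import mul

# Instance counts from Table 6.1, in the order the right-hand side nodes are matched.
instances = [
    ("Decrement", 3),
    ("aref", 6),
    ("Increment-or-Decrement", 12),
    ("mod", 4),
]

def level_bounds(counts):
    """Upper bound on the number of items at each level when only
    node-type constraints restrict the extensions."""
    return list(accumulate(counts, mul))

bounds = level_bounds([count for _, count in instances])
print(bounds)  # [3, 18, 216, 864]
```

The last value is only an upper bound: the edge connection constraint filters the extensions with MOD, so fewer than 864 items are actually created at that level.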

This  shows  how  a  lack  of  strong  match-interleaved  constraints  causes  the  number  of 
partial  items  to  build  up  exponentially.  In  fact,  flow  graph  parsing  with  a  flow  graph 
grammar  whose  rules  impose  no  edge  connection  constraints  or  any  other  binary  constraint 
is  NP-complete.  Appendix  A  shows  that  the  problem  of  recognizing  unordered  context- 
free  grammars  (UCFG)  can  be  reduced  to  flow  graph  parsing.  UCFGs  are  context-free  string 
grammars  in  which  the  symbols  in  the  right-hand  side  string  are  considered  unordered.  (For 
example, given a UCFG containing the rule S → xyz, S can be recognized in the strings xyz,
yxz,  zyx,  etc.) 
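
For a single rule whose right-hand side contains only terminals, the unordered reading amounts to comparing multisets of symbols. The sketch below illustrates only that special case; it is not a general UCFG recognizer (the general problem, with non-terminals, is the hard one), and the function name is a hypothetical choice.

```python
from collections import Counter
from itertools import permutations

def matches_unordered_rule(rhs, string):
    """True if `string` is some reordering of the terminal symbols in `rhs`."""
    return Counter(rhs) == Counter(string)

# Every ordering of "xyz" is accepted for the rule S -> xyz.
orderings = ["".join(p) for p in permutations("xyz")]
all_accepted = all(matches_unordered_rule("xyz", s) for s in orderings)
```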

Fortunately,  in  applying  the  flow  graph  parser  to  program  recognition,  other  constraints 
can  be  interleaved  with  the  matching  process  to  prune  item  trees  early.  These  are  the  co¬ 
occurrence  and  port  precedence  constraints.  (As  described  in  Section  4.1.1,  if  two  nodes  in 
a  right-hand  side  are  constrained  to  co-occur,  then  they  must  match  nodes  that  represent 
operations  in  the  same  control-environment.) 

The  precedence  relation  constraint  enforces  the  condition  that  the  data  structure  oper¬ 
ation  must  cut  across  slices  of  dataflow,  rather  than  allowing  the  disconnected  pieces  of  the 
operation  to  be  recognized  vertically  in  the  same  slice.  See  Figure  6-6.  Cyclic  reduction 
avoidance  prevents  B  from  being  recognized  in  the  rightmost  graph. 

The  advantage  of  match-interleaving  these  constraints  can  be  seen  by  contrasting  the 
parser's  performance  when  match-interleaving  the  constraints  to  its  performance  when  these 
constraints  are  parse-interleaved.  In  the  parse-interleaving  case,  item  trees  for  data  structure 
operations  are  extremely  bushy  and  can  be  exponential  in  the  worst  case.  Most  of  the  items 
at  the  leaves  are  killed  by  the  co-occurrence  and  port  precedence  constraints  when  they 
are finally applied. For example, the item tree for CIS-Extract, shown in Figure 6-5, has
372 items at height 4, but only 3 of these satisfy the co-occurrence and port precedence
constraints.

Figure 6-6: The restriction on ... (Panel label: A legal reduction.)

Figure 6-7: Skinny item tree produced in recognizing CIS-Extract with strong match-interleaved constraints.

With match-interleaving, the item trees are much shorter and skinnier, since the
co-occurrence constraints are applied as early as possible. Figure 6-7 shows the item tree for
CIS-Extract.  As  soon  as  the  Decrement  node  is  matched,  the  matches  of  all  the  other  nodes 
are  disambiguated  to  involve  only  nodes  in  the  same  control  environment. 
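
The effect of applying the co-occurrence constraint early can be illustrated with a small counting experiment. This is a sketch under stated assumptions, not the parser itself: the instance counts come from Table 6.1, but the assignment of instances to control environments (`ce0`, `ce1`, `ce2`) is invented, edge connection and port precedence constraints are ignored, and `count_items` is a hypothetical helper. The resulting counts therefore differ from those reported for the real run.

```python
def count_items(rhs, instances, interleave):
    """Return (items_created, successful_items) for a disconnected right-hand side.

    instances maps a node type to a list of (name, control_environment) pairs.
    With interleave=True the co-occurrence test runs on every extension;
    otherwise it runs only once the match is complete.
    """
    created = 0
    partial = [()]                       # the initial empty partial item
    for node_type in rhs:
        extended = []
        for item in partial:
            for inst in instances[node_type]:
                # Co-occurrence: all matched nodes share one control environment.
                if interleave and item and inst[1] != item[0][1]:
                    continue             # pruned before an item is created
                extended.append(item + (inst,))
        created += len(extended)
        partial = extended
    successful = [it for it in partial if len({env for _, env in it}) == 1]
    return created, len(successful)

rhs = ["Decrement", "aref", "Inc-or-Dec", "mod"]
instances = {
    "Decrement": [(f"dec{i}", f"ce{i % 3}") for i in range(3)],
    "aref": [(f"aref{i}", f"ce{i % 3}") for i in range(6)],
    "Inc-or-Dec": [(f"iod{i}", f"ce{i % 3}") for i in range(12)],
    "mod": [(f"mod{i}", f"ce{i % 3}") for i in range(4)],
}

late = count_items(rhs, instances, interleave=False)
early = count_items(rhs, instances, interleave=True)
```

In this synthetic setting both runs find the same successful items, but the late check creates far more items on the way, which mirrors the bushy versus skinny item trees described above.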

The  influence  that  match-interleaving  co-occurrence  constraints  has  on  reducing  the 
parser’s  search  can  also  be  seen  by  contrasting  the  parser’s  time  and  space  requirements 
when match-interleaving is performed versus when parse-interleaving is used. We do the
same  in  order  to  study  the  influence  of  match-interleaved  port  precedence  constraints.  This 
helps  us  evaluate  the  effectiveness  of  each  constraint  in  reducing  the  overall  complexity  of 
the  parser  and  it  allows  us  to  compare  the  relative  effectiveness  of  the  two  constraints. 

Figure 6-8 shows the results of running the CST example under the following four
conditions: a) parse-interleave both constraints, b) match-interleave co-occurrence,
parse-interleave port precedence, c) parse-interleave co-occurrence, match-interleave port
precedence, and d) match-interleave both.* In Figure 6-8, the number of items created by the
parser is shown as the number of items of three different types. "Successful" items are
complete items which satisfy all their rules' constraints. "Killed" items are complete or partial
items that have failed their rules' constraints. "Extendable" items are partial items that
have not yet failed any match-interleaved constraints and may be extended with complete
items for their immediately needed nodes. (The relationship between these sets and the
sets of complete and partial items is shown in Figure 6-9.)

* The run times for the experiments in this chapter were obtained by running the recognition system on
a Sparc 2 in Lucid. These statistics were collected with zip-up creation being performed, since zip-ups are
needed to recognize the simulator cliche. However, the number of zip-ups created in these runs is relatively
small, as is discussed in Section 6.2.6.

Figure 6-8: Results of running CST example with constraints parse-interleaved versus match-interleaved.
a) Parse-interleave both: time 201 seconds; successful 329; killed 1432; extendable 803.
b) Match-interleave co-occurrence, parse-interleave precedence: time 86 seconds; successful 329; killed 505; extendable 244.
c) Parse-interleave co-occurrence, match-interleave precedence: time 190 seconds; successful 329; killed 1230; extendable 736.
d) Match-interleave both: time 86 seconds; successful 329; killed 446; extendable 216.

Figure 6-9: Relationship of the sets of successful, killed, and extendable item sets to the
sets of complete and partial items.

The number of successful items remains the same over all the cases, as it should. The
effect of the two constraints can be seen in the total number of killed and extendable
items, which is reduced by more than 70% (from 2235 to 662) by match-interleaving both
constraints. This has the effect of dramatically speeding up the parser: when
match-interleaving both constraints, the parser is 133% faster than when parse-interleaving them.†
This is because partial items are killed earlier. Only 12% of the killed items had less than half
of their rules' right-hand sides matched when the two constraints were parse-interleaved.
However, when the constraints were match-interleaved, 53% of the killed items had less
than half of their rules' right-hand sides matched. This causes fewer extendable items to
be created, and therefore fewer killed items as well.

† Performance is the reciprocal of execution time, so performance increase n (as in "X is n% faster than
Y") is computed from the relationship $1 + \frac{n}{100} = \frac{\text{Execution time}_Y}{\text{Execution time}_X}$. (See Hennessy and Patterson,
Section 1.2 [57].)

Most of the savings are the result of match-interleaving co-occurrence constraints, which
reduces the number of killed and extendable items by 66% (from 2235 to 749). Port
precedence constraints have a more modest effect, reducing this number by only 12% (from 2235
to 1966).

In the PISIM example, match-interleaving has a less dramatic impact than in the CST
example, but it still helps, as can be seen in Figure 6-10. Match-interleaving both constraints
reduces the killed and extendable item count by 30% (from 1113 to 778). This is simply
because the rules used in recognizing the cliches in PISIM had strong node type and edge
connection constraints with respect to the input graph representing the PISIM program.
There was not as much need to rely on co-occurrence or port precedence constraints to
prune the search.

Figure 6-10: Results of running PISIM example with constraints parse-interleaved versus match-interleaved.
a) Parse-interleave both: time 179 seconds; successful 436; killed 774; extendable 339.
b) Match-interleave co-occurrence, parse-interleave precedence: time 161 seconds; successful 436; killed 572; extendable 263.
c) Parse-interleave co-occurrence, match-interleave precedence: time 173 seconds; successful 436; killed and extendable combined 1010; extendable 328.
d) Match-interleave both: time 148 seconds; successful 436; killed 525; extendable 263.

As in the CST example, match-interleaving co-occurrence constraints had more of an
effect than match-interleaving port precedence constraints. Match-interleaved co-occurrence
checking reduces the number of killed and extendable items by 25% (from 1113 to 835),
while match-interleaved port precedence checking only reduced the number by 9% (from
1113 to 1010).
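
The percentages quoted in this comparison follow from two simple formulas: the relative reduction in item counts, and the "n% faster" relationship from Hennessy and Patterson in which the ratio of execution times equals 1 + n/100. The sketch below recomputes them from the counts and times given in the text; the function names are illustrative assumptions.

```python
def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

def percent_faster(time_slow, time_fast):
    """n such that time_slow / time_fast = 1 + n/100."""
    return 100.0 * (time_slow / time_fast - 1.0)

# CST: killed plus extendable items, and run times in seconds.
cst_both = percent_reduction(2235, 662)
cst_cooccur = percent_reduction(2235, 749)
cst_precedence = percent_reduction(2235, 1966)
cst_speedup = percent_faster(201, 86)

# PISIM: killed plus extendable items.
pisim_cooccur = percent_reduction(1113, 835)
pisim_precedence = percent_reduction(1113, 1010)
```

With these inputs, `cst_both` is just above 70, `cst_cooccur` rounds to 66, `cst_precedence` rounds to 12, and `cst_speedup` is a little over 133, in agreement with the figures reported above.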

The  two  experiments  above  allow  us  to  evaluate  the  co-occurrence  and  port  precedence 
constraints  as  candidates  for  match-interleaving,  with  respect  to  two  particular  input  flow 
graphs  and  a  specific  graph  grammar.  Co-occurrence  constraints  are  excellent  candidates,  in 
terms  of  their  effectiveness,  cost,  and  applicability.  Co-occurrence  constraints  are  effective 
as  evidenced  by  the  vast  decrease  in  the  number  of  items  created  when  they  are  match- 
interleaved. They are particularly valuable when other binary constraints are weak, which
is the case in the rules representing aggregate data structure cliches that are activated in
recognizing the CST example. Co-occurrence constraints can be checked cheaply by simply
comparing two attribute values. Since all nodes have control environments, co-occurrence
constraints are applicable to any pair of nodes in a right-hand side.

Port  precedence  constraints  are  also  good  candidates  for  match-interleaving,  although 
not  as  good  as  co-occurrence  constraints.  They  are  modestly  effective  in  reducing  the 
number  of  items  created.  The  cost  of  checking  port  precedence  constraints  incrementally 
is  no  more  than  the  cost  of  checking  them  all  at  once  when  an  item  is  complete.  Their 
applicability  is  limited  to  only  input  ports  of  a  right-hand  side  graph.  That  is,  if  they 
are  included  as  part  of  the  extendibility  criterion,  they  only  apply  to  pairs  of  partial  and 
complete  items  in  which  the  complete  item  is  representing  the  recognition  of  a  left-fringe 
node. 

Implications  for  Chart  Organization 

The  decision  as  to  which  constraints  should  be  interleaved  with  the  matching  process  con¬ 
cerns  which  constraints  should  be  included  as  part  of  the  extendibility  criterion.  The  ex¬ 
tendibility  criterion  is  checked  in  two  steps.  Some  parts  of  the  extendibility  criterion  are 
enforced  when  a  candidate  item  is  retrieved  from  the  chart.  The  rest  are  checked  by  filtering 
the  candidate  items  that  have  been  retrieved.  The  parts  that  are  checked  during  candidate 
retrieval  influence  the  design  of  the  organization  of  the  chart. 

If  a  certain  constraint  is  strong  in  that  it  can  usually  be  satisfied  by  only  a  few  items  and 
this  constraint  refers  to  some  attribute  or  part  of  an  item,  then  it  can  be  used  as  an  index 
into  the  chart.  Node  type  and  edge  connection  constraints  are  very  important  in  reducing 
the  combinatorics  of  matching  many  right-hand  sides.  Currently,  the  chart  is  organized  so 
that  complete  items  are  indexed  by  their  label  and  location  and  partial  items  are  indexed 
by  the  node  types  of  their  immediately  needed  nodes  and  the  locations  at  which  they  are 
needed.  Constraints  on  node  type  and  location  are  therefore  enforced  during  item  retrieval. 
In  the  future,  it  might  be  beneficial  to  index  on  control-environment  information  as  well. 
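
The indexing scheme described above can be pictured as two dictionaries keyed by (node type or label, location). The following is a minimal sketch under that reading; the class name `Chart`, its method names, and the use of plain strings for items are assumptions for illustration, not the system's actual data structures.

```python
from collections import defaultdict

class Chart:
    """Toy chart: retrieval by key enforces node-type and location constraints."""

    def __init__(self):
        self.complete = defaultdict(list)  # (label, location) -> complete items
        self.partial = defaultdict(list)   # (needed node type, location) -> partial items

    def add_complete(self, label, location, item):
        """Store a complete item and return the partial items that need it."""
        self.complete[(label, location)].append(item)
        return list(self.partial.get((label, location), []))

    def add_partial(self, needed_type, location, item):
        """Store a partial item and return the complete items that can extend it."""
        self.partial[(needed_type, location)].append(item)
        return list(self.complete.get((needed_type, location), []))

chart = Chart()
first = chart.add_partial("mod", 12, "partial-1")     # nothing to extend with yet
second = chart.add_complete("mod", 12, "complete-1")  # finds partial-1
third = chart.add_complete("mod", 15, "complete-2")   # wrong location for partial-1
```

Candidates returned by these lookups would still have to pass the remaining parts of the extendibility criterion, which are checked by filtering. Indexing on control-environment information, as suggested in the text, would correspond to adding a third component to the keys.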

6.2.3 Grammar Facilitates Reusing Sub-Search Space Exploration

In  addition  to  constraints,  the  complexity  of  parsing  can  be  reduced  if  the  grammar  captures 
the  commonalities  among  the  flow  graphs  being  recognized  in  its  hierarchical  structure.  The 
grammar  may  specify  that  a  non-terminal  derives  some  sub-flow  graph  that  is  common  to 
several  other  flow  graphs.  When  an  instance  of  this  non-terminal  is  found,  the  results  of 
the  recognition  are  reused  in  recognizing  all  the  flow  graphs  that  contain  it,  rather  than 
repeatedly  matching  the  common  sub-flow  graph. 

In  terms  of  item  trees,  the  effect  of  a  good  grammar  organization  such  as  this  is  that  it 
prevents  multiple  redundant  sub-trees  from  being  grown  within  each  tree.  In  other  words, 
if  the  grammar  captures  commonality,  the  parser  can  avoid  exploring  parts  of  the  search 
space  over  and  over. 

6.2.4  Empirical  Observations  of  Item  Trees 

In  using  the  graph  parser  to  recognize  two  example  simulator  programs,  we  have  found  the 
item  trees  to  be  typically  sparse  and  skinny.  This  section  summarizes  statistics  concerning 
the  characteristics  of  the  item  trees  that  are  created  in  recognizing  the  CST  and  PISIM 
programs. 

In  the  recognition  runs,  both  co-occurrence  and  port  precedence  constraints  are  match- 
interleaved.  Also,  zip-up  creation  was  being  performed  by  the  parser,  since  it  is  needed  to 
recognize  the  simulator  cliches.  Zip-up  items  increase  the  number  of  instances  of  particular 
node  types.  However,  the  number  of  zip-ups  only  negligibly  increases  the  number  of  items 
created  in  parsing.  Since  there  are  so  few  of  them,  they  do  not  significantly  affect  the  node 
type  distribution  nor  the  branching  factor  of  item  trees.  Section  6.2.6  characterizes  the 
number  of  zip-up  items  created  by  the  parser  and  gives  empirical  statistics  for  the  actual 
number  created  in  practice. 

The  “bushiness”  of  the  item  trees  gives  an  indication  of  whether  the  parser  is  encoun¬ 
tering  exponential  behavior.  We  measure  this  property  of  the  trees  in  the  following  ways. 
We  look  at  the  maximum  width  of  the  item  trees  and  observe  how  it  changes  as  the  height 
of  the  item  trees  increases.  The  maximum  width  of  an  item  tree  is  the  maximum,  over  all 
possible  sizes  of  items,  of  the  number  of  items  in  the  tree  of  a  particular  size.  (It  is  the 
same  as  the  maximum  number  of  items  at  a  particular  level  in  an  item  tree.)  If  the  parser 
requires  exponential  space  and  time,  the  maximum  width  will  increase  exponentially  with 
the  height  of  the  tree.  The  height  of  an  item  tree  is  the  maximum  size  of  the  items  in  the 
tree. 
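
Both measures can be computed directly from the sizes of the items in a tree. The sketch below assumes an item tree is given simply as a list of item sizes, with the empty initial item counted at size 0; the function name `tree_shape` is hypothetical. The example uses the level counts reported for the bushy CIS-Extract tree.

```python
from collections import Counter

def tree_shape(item_sizes):
    """Return (height, maximum width, width of each level) for a non-empty item tree."""
    widths = Counter(item_sizes)
    height = max(widths)                      # largest item size in the tree
    max_width = max(widths.values())          # most items at any single level
    per_level = [widths.get(level, 0) for level in range(height + 1)]
    return height, max_width, per_level

# Bushy CIS-Extract tree: 1 empty item, then 3, 18, 216, and 372 items.
sizes = [0] + [1] * 3 + [2] * 18 + [3] * 216 + [4] * 372
height, max_width, per_level = tree_shape(sizes)
```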

We also look at the branching factor of the trees and how it varies as we increase the
height of the non-terminal being recognized. This is done to detect an exponential buildup
in the number of instances of non-terminals as their height in the grammar increases. (Recall
the worst case of this can cause $O(n^{k^h})$ instances of a non-terminal at height h
to be created using a rule whose right-hand side is of size k, as discussed at the beginning of
Section 6.2.) If the parser is experiencing an exponential explosion, the average branching
factor over all the trees of non-terminals of a particular height in the grammar will increase
as the height is increased. Otherwise, it will stay the same or decrease.

tree      maximum          average          median
height    maximum width    maximum width    maximum width
0         1                1.00             1
1         28               5.84             3
2         28               10.88            5
3         13               6.60             6
4         43               19.00            16
5         3                3.00             3

Table 6.2: Tree height versus maximum width statistics for item trees in CST.

tree      maximum          average          median
height    maximum width    maximum width    maximum width
0         1                1.00             1
1         24               5.77             4
2         43               8.09             5
3         9                6.00             6
4         38               13.25            4
5         0                0.00             0
6         0                0.00             0
7         32               32.00            32

Table 6.3: Tree height versus maximum width statistics for item trees in PISIM.

Maximum  Width 

For  each  item  tree,  we  computed  its  maximum  width,  which  is  the  maximum  number  of 
items  on  any  level  in  the  tree.  Tables  6.2  and  6.3  show,  for  each  tree  height,  the  maximum, 
average,  and  median  maximum  width  of  the  trees  of  that  height. 

As  the  tree  height  increases,  none  of  the  statistics  for  the  maximum  width  of  the  trees 
increase  exponentially.  This  includes  the  maximum  of  the  maximum  widths  of  the  trees 
at  each  possible  height,  which  would  indicate  the  existence  of  even  one  bushy  tree.  For 
the  trees  over  a  particular  height,  the  average  maximum  width  is  typically  much  smaller 
than  the  maximum  maximum  width  and  the  median  maximum  width  is  even  smaller.  This 
means  that  there  are  few  relatively  wide  trees  among  trees  of  a  particular  height. 

Figure 6-11: The shapes of item trees having maximum maximum width. (Panel c: tree from the PISIM example, height = 7, maximum width = 32.)

In  general,  for  trees  of  height  4  to  7  the  maximum  width  level  of  an  item  tree  occurs 
in  the  middle  of  the  tree.  The  width  tapers  off  deeper  in  the  tree,  as  constraints  prune  it. 
Figure  6-11  shows  the  shapes  of  trees  of  height  4  and  7  which  have  the  maximum  maximum 
width.  The  shapes  are  shown  in  terms  of  the  width  of  each  level. 

Branching  Factor 

We  now  observe  how  the  branching  factor  of  an  item  tree  changes  as  we  vary  the  height  of 
the  non-terminal  being  recognized  by  the  items  in  the  item  tree.  Tables  6.4  and  6.5  show  the 
maximum,  average  and  median  branching  factor  over  all  the  item  trees  of  each  possible  non¬ 
terminal  height  for  CST  and  PISIM,  respectively.  In  general,  the  branching  factors  of  item 
trees  produced  in  both  examples  decrease  as  the  height  of  their  non-terminal  increases. 
So  there  is  no  exponential  build-up  occurring  as  non-terminals  higher  in  the  grammar  are 
recognized. 

For  low-level  non-terminals,  the  maximum  branching  factor  is  much  worse  than  the 
average  or  median  branching  factors.  This  shows  that  the  relatively  bushy  trees  for  these 
non-terminals  are  few  in  number.  (For  high-level  non-terminals,  the  maximum  branching 
factor  is  comparable  to  the  average  and  median  branching  factor,  which  is  small  -  only  1 
for  most  high  level  non-terminals  in  the  CST  example!) 

The  table  also  includes  the  maximum  maximum  width  of  all  the  trees  at  each  non¬ 
terminal  height.  This  shows  that  in  general  the  maximum  width  trees  occur  in  recognizing 
low-level  non-terminals. 

These  statistics  show  that  the  item  trees  produced  in  recognizing  the  two  example 
programs  are  typically  skinny.  These  examples  represent  two  real  programs,  showing  the 
good  behavior  of  the  parser  in  practice,  despite  its  potential  for  worst  case  exponential 
performance. Further experimentation is needed with other programs to see how typical this
is and what additional constraints may be needed to keep the complexity under control.

non-terminal   maximum            average            median
height         branching factor   branching factor   branching factor
1              12.00              8.17               6.00
2              28.00              16.34              6.80
3              9.00               7.75               8.00
4              7.00               3.01               2.33
5              19.00              4.76               3.00
6              19.00              4.76               3.00
7              3.00               1.50               1.00
8              6.75               3.16               1.74
9              4.00               2.33               2.00
10             3.00               1.83               1.33
11             9.00               3.25               1.00
12             2.50               2.50               2.50
13             1.00               1.00               1.00
14             1.00               1.00               1.00
15             1.50               1.50               1.50
16             1.00               1.00               1.00
17             1.00               1.00               1.00
18             1.00               1.00               1.00
19             1.00               1.00               1.00
20             2.33               1.67               1.00
21             0.00               0.00               0.00
22             0.00               0.00               0.00
23             0.00               0.00               0.00

(Of the maximum maximum width column, only the entry 12 is legible.)

Table 6.4: CST: Branching factor statistics for item trees of non-terminals over the range of
possible node-type heights.

non-terminal   maximum            average            median             maximum
height         branching factor   branching factor   branching factor   maximum width
1              15.00              8.35               7.00               38
2              24.00              8.90               4.00               24
3              10.00              6.46               6.25               43
4              4.00               2.69               2.50               16
5              7.00               2.13               2.00               7
6              2.00               1.51               1.50               9
7              5.00               2.73               2.33               6
8              2.00               2.00               2.00               2
9              3.00               2.33               3.00               3
10             3.00               1.87               1.60               4
11             3.33               3.33               3.33               6
12             7.00               4.50               2.00               7
13             2.00               2.00               2.00               2
14             2.00               2.00               2.00               2
15             3.00               2.50               2.50               4
16             4.00               3.00               4.00               4
17             4.00               2.50               1.00               4
18             2.39               2.39               2.39               32
19             4.00               4.00               4.00               4
20             2.56               2.56               2.56               8
21             4.50               4.50               4.50               5
22             4.00               4.00               4.00               4
23             1.60               1.60               1.60               4

Table 6.5: PISIM: Branching factor statistics for item trees of non-terminals over the range
of possible node-type heights.

6.2.5 Modeling Constraint Consistency

We can discuss the effect constraints have on the complexity of recognition in terms of a
model of consistency that Eric Grimson [49, 50] presented in analyzing his constrained search
object recognition algorithm. (This in turn is based on general analyses of the consistent
labeling problem, of which constrained search and sub-flow graph matching are
specializations.)

In constrained search, sensory data are searched for an object model by incrementally
building a tree of interpretations, which are lists of pairings of data and model features.
Each node in the interpretation tree represents an interpretation of size k, where k is the
level of the node in the tree. The size of the interpretation is the number of pairings it
contains. Each of the children of a node that represents an interpretation I represents an
augmentation of I with an additional pairing. At each step, the additional pairings are all
between the same data fragment and each of the possible model features.

Interpretation trees are analogous to item trees that are produced when strict node
orderings are used. However, the roles of model and data fragments correspond to the roles
of the input graph and right-hand side graph, respectively. (At each step in the item tree,
the partial items are all extended with complete items for the same right-hand side node,
not the same input graph node.)

Unary and binary constraints are used to prune the interpretation trees. Examples
are edge length and relative distance constraints. Grimson's formulation captures
the notion that as the size of an interpretation increases, the probability that a random
matching of that size is consistent in terms of the constraints decreases. This means that
if the unary and binary constraints are strong enough, the interpretation trees will tend to
be sparse rather than bushy.

Grimson defines the number of analyses of a particular size in terms of the probability
that an analysis of that size will be consistent in terms of the constraints.

The probability that a set of data-model pairings will satisfy unary and binary constraints
even if they are not part of a correct interpretation depends on the strength of the
constraints. This in turn depends on the properties of the data and models. In the flow
graph parsing problem, several input graph nodes of the same type (ambiguity) will weaken
the unary node-type constraints of right-hand sides containing that node type. This will
make it more likely that a random pairing of an input graph node with a right-hand side
node will satisfy this constraint even though the pairing is not part of a valid interpretation.
Similarly, if the input graph is highly connected, edge connection constraints are more likely
to be satisfied by random pairings.

Grimson relates this probability to properties of the object recognition problem, such
as the amount of sensory error, the number of model fragments, and the model object’s
perimeter. He then proves that the expected amount of search to find a correct interpretation
is quadratic in the parameters (when all the data belong to the same object and the
identity of the object is known).

In the future, it would be interesting to compute the analogous relationship of probabilities
of consistency to properties of programs and cliches, such as node-type or control
environment distributions or number of dataflow dependencies. The probabilities provide
a measure of the effectiveness of the constraints. This information could then be used to
automatically generate advice concerning the optimal order of application of constraints.

Grimson also provides interesting results that point out the need for good indexing and
selection  techniques  to  control  the  complexity  of  recognizing  partially  occluded  objects  in 
noisy,  cluttered  scenes.  Indexing  is  the  problem  of  selecting  from  the  model  object  library  a 
small  number  of  model  objects  that  are  likely  to  be  in  the  scene.  Selection  is  the  problem  of 
grouping  together  data  features  that  are  likely  to  have  come  from  the  same  object.  These 
results  carry  over  to  the  program  recognition  domain.  They  will  be  relevant  to  future  work 
in  applying  our  parser  to  the  analogous  task  of  near-miss  recognition,  which  is  the  task 
of  finding  the  “best”  partial  recognition  of  a  cliche.  (Currently,  our  recognition  system  is 
able  to  do  partial  recognition  of  programs,  but  does  not  generate  maximally-sized  partial 
recognitions  of  cliches.)  Section  6.2.7  discusses  this  further. 

6.2.6  Counting  Zip-ups 

The  effect  of  zipping  up  complete  items  is  that  more  instances  of  non-terminals  may  arise. 
This  can  cause  the  branching  factor  to  increase  in  item  trees  for  higher-level  non-terminals. 
Usually,  however,  the  binary  constraints  on  the  inputs  and  outputs  of  the  zipped  up  items 
(especially  the  edge  connection  constraints)  are  powerful  enough  to  quickly  disambiguate 
the  instances  so  the  branching  factor  is  not  affected  much. 

The number of zip-ups depends on the number of instances of a non-terminal found at
a particular location such that:

•  either  all  of  the  edges  specified  in  the  candidates’  input  mappings  share  the  same 
source  ports  or  all  of  the  edges  in  their  output  mappings  share  the  same  sink  ports, 
or  both, 

•  none  of  the  input  mappings  of  the  candidates  overlap  (i.e.,  contain  common  edges) 
and  neither  do  the  output  mappings,  and 

•  the  attribute  values  of  the  zipped  up  item’s  left-hand  side  are  defined,  with  respect  to 
the  attribute  combination  function.  (See  Section  3.5.1.)  In  other  words,  zipping  up 
the  candidates  makes  sense  in  terms  of  the  attributes  of  the  resulting  non-terminal 
instance. 

To count the number of zip-ups for some non-terminal or terminal node-type, divide the
items for the node-type into maximally-sized groups of items that can be zipped up, according
to the above definition. These groups may overlap. Within each group of items,
zip-ups are created from each subset of the group (for subsets of size greater than one). So,
for a group $g$ of items that can be zipped up, $2^{|g|} - |g| - 1$ items are created.

  height   zip-ups (CST)   zip-ups (PiSim)
  ------   -------------   ---------------
    0            3                7
    1            4               10
    2            3                5
    3            1                0
    4            0                0
    5            1                0
   ≥ 6           0                0

Table 6.6: Distribution of zip-up count over height of node-type in grammar.
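
As a quick check of this count, the subsets of size greater than one of a group of size $|g|$ number $2^{|g|} - |g| - 1$ (all subsets, minus the $|g|$ singletons and the empty set). The sketch below enumerates them directly; the function names are illustrative only and are not part of the recognition system.

```python
from itertools import combinations


def zip_up_subsets(group):
    """All subsets of `group` with more than one member."""
    members = list(group)
    return [subset
            for size in range(2, len(members) + 1)
            for subset in combinations(members, size)]


def zip_up_count(group_size):
    """Closed form: 2^|g| - |g| - 1."""
    return 2 ** group_size - group_size - 1
```

For the group sizes observed in practice (two or three items), this gives one and four zip-ups per group, respectively.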

Empirical  Observations 

Zipping  up  is  actually  a  rare  occurrence  in  practice.  The  reason  is  that  programmers  tend 
not  to  write  redundant  code.  Function-sharing  is  a  common  optimization  employed  to  avoid 
redoing  work  -  for  the  programmer  in  writing  the  code  and  for  the  machine  in  executing  it. 
(Optimizations  usually  add  to  the  complexity  of  recognition,  but  in  this  case,  the  function¬ 
sharing  optimization  actually  helps.) 

The  need  for  zip-ups  does  occur,  but  relatively  infrequently.  Programmers  cannot  (or  do 
not  want  to)  share  all  common  sub-computations.  One  reason  is  that  sometimes  it  is  cheap 
to  recompute  some  value  whenever  it  is  used  and  the  programmer  does  not  want  to  go  to 
the  trouble  of  defining  a  local  variable  to  hold  the  shared  result.  Another  situation  in  which 
redundancy  can  occur  is  in  writing  conditionals  in  which  some  but  not  all  of  the  branches 
contain  common  computations.  The  code  is  sometimes  more  understandable,  and  easier  to 
write  correctly  if  the  computation  is  repeated,  rather  than  shared.  This  situation  is  rare, 
since  it  is  usually  possible  to  combine  the  conditional  cases  that  have  the  same  consequence 
into  a  single  case.  Both  of  these  situations  normally  involve  small  expressions,  containing 
primitive functions. So the complete items that are typically zipped up are for terminals in
the input graph or low-level non-terminals.

In the CST example, only 12 zip-ups were created (out of 991 total items) and they all
were zip-ups of low-level non-terminals. In PiSim, only 22 zip-ups were created (out of
1224 total items). In both cases, they all were zip-ups of items for terminals or low-level
non-terminals, as the distribution of zip-up count over node-type height shows in Table 6.6.
(Terminal  node  types  have  height  0.) 

In both examples, the size of the group of candidate items being zipped up was either
two or three, with an average of 2.1 and a median of 2.

(Both  examples  were  run  with  strict  node  orderings  on  the  rules  and  match-interleaved 
co-occurrence  and  port-precedence  constraints.) 

6.2.7  Partial  Node  Orderings 

When  node  orderings  are  not  restricted  to  being  strict,  partial  items  can  have  more  than 
one  immediately  needed  node.  This  causes  more  partial  items  to  be  created.  It  also  causes 
duplicate  items  to  arise,  which  are  worthless  and  are  not  added  to  the  chart. 

In  terms  of  item  trees,  partial  node  orderings  increase  the  branching  factor  of  the  trees. 
A  partial  item  can  be  extended  more  than  once  with  complete  items  for  the  same  node  (if 
there  is  ambiguity)  and/or  with  complete  items  for  more  than  one  node  (if  the  item  has 
more  than  one  immediately  needed  node).  Section  6.2  explored  the  effect  of  ambiguity  on 
the  branching  factor  of  item  trees.  This  section  discusses  the  effect  of  using  partial  node 
orderings. 

The worst case partial node ordering is no ordering at all: no pair of right-hand side
nodes is related. In this case, the number of different (non-duplicate) items created in
recognizing a rule’s right-hand side of size $k$ nodes is at least $2^k$. There is a partial item for
each member of the power set of the rule’s right-hand side nodes. (More than $2^k$ items are
created if there is any ambiguity.) Contrast this with strict ordering in which only $k$ items
will be created if there is no ambiguity.

With no node ordering, there will be $m - 1$ duplicates of an item of size $m$. To see
this, consider an item $I_1$ of size $m$. $I_1$’s parent is one of $m$ possible parents (since there are
$m$ ways of choosing a subset of size $m - 1$ of $I_1$’s already matched nodes). All $m$ possible
parents have been created, since there is no node ordering. One is the parent of $I_1$. The
other $m - 1$ are parents of duplicates of $I_1$.
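
The counting argument can be checked by brute force. The sketch below is only a model of extension with no node ordering and no ambiguity: an item is represented simply as the set of right-hand side nodes it has matched, which is an assumption of this illustration rather than the parser's actual item structure.

```python
from math import comb


def count_unordered_items(k):
    """Simulate extension of a k-node right-hand side with no node ordering.

    Returns (number of distinct items, number of duplicates generated).
    Duplicates are detected and discarded, so they are never extended.
    """
    empty = frozenset()
    chart = {empty}
    agenda = [empty]
    duplicates = 0
    while agenda:
        item = agenda.pop()
        for node in range(k):
            if node in item:
                continue
            extended = item | {node}
            if extended in chart:
                duplicates += 1          # same matched node set reached again
            else:
                chart.add(extended)
                agenda.append(extended)
    return len(chart), duplicates


def expected_duplicates(k):
    """Each of the C(k, m) items of size m has m - 1 duplicates."""
    return sum((m - 1) * comb(k, m) for m in range(1, k + 1))
```

Under these assumptions the simulation produces one distinct item per member of the power set, and the duplicate total agrees with summing $m - 1$ over all items of each size $m$.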

So, with no node ordering, the total number of duplicate items created in recognizing a
right-hand side flow graph of size $k$ is

$$\sum_{m=1}^{k} (m - 1) \binom{k}{m}$$

This section gives some empirical observations of the recognition of our example programs
under the conditions of three different node orderings. It then discusses the advantages
of using partial node orderings versus using strict node orderings, in terms of efficiency
and recognition power. Finally, it discusses ways of choosing a rule’s node ordering.

Empirical  Results 

To get a feel for how partial node orderings affect recognition performance, we perform
recognition on our two example programs using two different partial node orderings and
compare the results to those obtained using strict node orderings.

One partial node ordering is edge-based in that a node $n_1$ is $<_n$ another node $n_2$ if $n_1$ has an
output connected to an input of $n_2$ and $n_2$ has no input that is an input of the right-hand
side graph. The minimal nodes in this ordering are all the nodes in the right-hand side that
are on the left-fringe (i.e., have input ports that are inputs to the right-hand side flow graph).
When this node ordering is used, an empty partial item for recognizing some rule has all the
left-fringe nodes of the rule’s right-hand side as its initial set of immediately needed nodes.
When a partial item is created by extending another partial item with a complete item for
some node x, all nodes connected to x that have not already been matched are added to
the immediately needed node set.

With  the  grammar  used  by  the  current  system,  an  edge-based  node  ordering  is  an 
approximation  of  having  no  node  ordering,  which  the  current  recognition  system  cannot 
handle  because  the  current  implementation  is  not  flexible  or  robust  enough.  Edge-based 
orderings  take  advantage  of  the  fact  that  many  of  the  right-hand  sides  of  rules  in  our 
grammar consist mostly of nodes that have at least one input that is an input of the right-hand
side flow graph. These nodes will all be considered minimal nodes in the node ordering.
If all nodes of a right-hand side have some input that is a right-hand side flow graph input,
then none of the nodes will be ordered with respect to any other node.

The other node ordering considered is topological: a node $n_1$ is $<_n$ another node $n_2$ if the
two nodes are connected by an edge from $n_1$ to $n_2$ and there is no other node $n_3$ such that
$n_1 <_n n_3$ and $n_3 <_n n_2$. (This is not exactly the same as a topological sort of a dag [21],
since it does not completely linearize the partial order imposed by the edges of the flow
graph. Nodes that have no edges connected to their inputs are not ordered with respect to
each other.)
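
To make the two partial orderings concrete, here is a small sketch that computes the ordered pairs for a right-hand side given as a set of dataflow edges. It rests on assumptions: the right-hand side is acyclic, the left-fringe set is supplied by the caller, and "no other node $n_3$ in between" is read as "no longer path from $n_1$ to $n_2$ bypasses the direct edge". The function names are hypothetical.

```python
from collections import defaultdict


def edge_based_order(edges, fringe):
    """Pairs (n1, n2): n1 feeds n2 and n2 has no right-hand side input."""
    return {(a, b) for (a, b) in edges if b not in fringe}


def topological_order(edges):
    """Pairs (n1, n2): an edge n1 -> n2 that no longer path bypasses."""
    successors = defaultdict(set)
    for a, b in edges:
        successors[a].add(b)
    kept = set()
    for a, b in edges:
        # Look for a path from a to b that does not start with the edge (a, b).
        stack = [n for n in successors[a] if n != b]
        seen, bypassed = set(), False
        while stack:
            n = stack.pop()
            if n == b:
                bypassed = True
                break
            if n not in seen:
                seen.add(n)
                stack.extend(successors[n])
        if not bypassed:
            kept.add((a, b))
    return kept


def minimal_nodes(nodes, order):
    """Nodes with no predecessor in the ordering (initially needed nodes)."""
    return set(nodes) - {b for (_, b) in order}


# Hypothetical right-hand side: a and d each take a right-hand side input.
nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")}
fringe = {"a", "d"}
```

In this example the edge-based ordering leaves both fringe nodes minimal, while the topological ordering has a single minimal node and is therefore closer to a strict ordering.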

Each  program  was  run  with  the  edge-based  node  ordering  and  then  with  the  topological 
node  ordering.  The  results  of  these  two  runs  can  be  compared  to  the  results  of  recognizing 
the  programs  using  a  strict  node  ordering  on  the  rules.  The  strict  node  orderings  are  optimal 
in  that  they  are  designed  to  match  salient  nodes  first.  They  are  manually  assigned  to  the 
grammar  rules. 

Tables 6.7 and 6.8 show the results of the three experimental runs on the CST and PiSim
programs, respectively. In the CST example, the strict node ordering is more than 200%
faster than the edge-based ordering, reducing the total number of items by 62% and creating
less than a third of the number of killed and extendable items. In fact, it creates less than
one fourth the number of partial items that are not killed (i.e., are extendable). The strict
node ordering does not save as much over the topological node ordering as it does over the
edge-based ordering. However, it nearly halves the number of extendable items.

Similarly, in the PiSim example, using the strict node ordering allows the parser to run
238% faster than with the edge-based ordering and reduces the total number of items created
by more than 50%. Less than one fourth of the number of extendable items are produced.
Again, there is only a slight difference in the number of items created in using the
topological versus using strict node orderings.

  items               edge-based   topological   strict
  -----------------   ----------   -----------   ------
  successful               329          329         329
  killed                  1296          491         446
  extendable               994          418         216
  total                   2619         1238         991
  killed+extendable       2290          909         662
  time (seconds)           260          104          86

Table 6.7: Experimental runs with CST using three different types of node orderings.


  items               edge-based   topological   strict
  -----------------   ----------   -----------   ------
  successful               436          436         436
  killed                   953          597         525
  extendable              1073          356         263
  total                   2462         1389        1224
  killed+extendable       2026          953         788
  time (seconds)           501          187         148

Table 6.8: Experimental runs with PiSim using three different types of node orderings.

It is significant that the topological node ordering does nearly as well as the strict
node ordering in terms of efficiency, since it is based on an easy, automatable ordering
heuristic. The reason that the two node orderings yield comparable results is that the rules
are typically long and skinny so that the partial topological node orderings are nearly strict
node orderings. The strict node orderings can be seen as topological node orderings that
are improved using saliency information.
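
The percentages quoted for Tables 6.7 and 6.8 can be re-derived from the raw counts. The sketch below copies the item counts and run times from the two tables; the helper names and the choice of which ratios to report are ours, made only to show the arithmetic.

```python
# Counts and times copied from Table 6.7 (CST) and Table 6.8 (PiSim).
RUNS = {
    "CST": {
        "edge-based":  {"extendable": 994,  "total": 2619, "seconds": 260},
        "topological": {"extendable": 418,  "total": 1238, "seconds": 104},
        "strict":      {"extendable": 216,  "total": 991,  "seconds": 86},
    },
    "PiSim": {
        "edge-based":  {"extendable": 1073, "total": 2462, "seconds": 501},
        "topological": {"extendable": 356,  "total": 1389, "seconds": 187},
        "strict":      {"extendable": 263,  "total": 1224, "seconds": 148},
    },
}


def percent_faster(slow_seconds, fast_seconds):
    """Speed-up of the fast run over the slow run, as a percentage."""
    return 100.0 * (slow_seconds / fast_seconds - 1.0)


def percent_fewer(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


def compare(program, baseline, ordering="strict"):
    """Compare one ordering against a baseline ordering for a program."""
    base, run = RUNS[program][baseline], RUNS[program][ordering]
    return {
        "percent_faster": percent_faster(base["seconds"], run["seconds"]),
        "percent_fewer_items": percent_fewer(base["total"], run["total"]),
        "extendable_ratio": run["extendable"] / base["extendable"],
    }
```

For example, `compare("CST", "edge-based")` reports the strict ordering against the edge-based one, and `compare("CST", "topological")` shows how much closer the topological ordering comes.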

The strict node orderings that were used in the example runs above were assigned
manually and were designed to place early in the ordering the node types that are salient with
respect to the input graph. The measure of saliency of a node type is based on the number of
instances  of  that  node  type  there  are  in  the  input  graph;  lower  instance  counts  mean  higher 
saliency.  This  takes  into  consideration  non-terminal  node  type  counts,  so  this  assignment  of 
strict  node  orderings  relies  on  knowledge  of  the  input  graph  and  results  of  prior  recognition 
runs.  Below,  we  discuss  ways  of  approximately  measuring  the  saliency  of  non-terminal  node 
types  automatically. 

Partial  Versus  Strict  Node  Orderings 

There  is  no  doubt  that  using  partial  node  orderings  is  more  expensive  than  using  strict  node 
orderings.  However,  using  partial  node  orderings  has  advantages  in  terms  of  flexibility  and 
tolerance  when  a  cliche  is  not  entirely  recognizable.  Since  it  allows  more  than  one  order 
in  which  to  match  right-hand  side  nodes,  if  a  portion  is  missing,  an  order  in  which  the 
other  portion  is  matched  first  can  still  yield  useful  partial  information.  With  a  strict  node 
ordering,  only  one  order  of  matching  is  tried,  so  if  a  node  is  missing,  all  nodes  following  it 
in  the  strict  ordering  will  be  prevented  from  being  matched. 

In other words, partial node orderings allow partial recognition of right-hand sides of
rules. This is a type of partial recognition which is different from the partial recognition of
the input graph. (In the program recognition domain, this is partial recognition of cliches,
as opposed to partial recognition of programs, as defined in Section 3.3.1.) To distinguish
it from partial recognition of the input graph, we use the term near-miss recognition.

Near-miss  recognition  is  useful  in  being  able  to  try  harder.  Pure  near-miss  recognition  - 
using  no  node  ordering  -  generates  maximally-sized  partial  analyses.  These  can  give  clues  as 
to  which  small  set  of  constraints  must  be  relaxed,  suspended,  or  satisfied  (e.g.,  by  changing 
the input graph) in order for some cliche to be recognized. This has applications both in
debugging  programs  (in  which  a  programmer  meant  to  use  a  cliche  but  did  so  incorrectly) 
and  in  learning  new  cliches. 

In general, with partial node orderings, the partial analyses can become larger and more
plentiful than with strict node orderings. This reveals a trade-off between the efficiency of
strict node orderings, which cut off analyses as soon as constraints are violated, and the
near-miss recognition power afforded by partial node orderings, which explore more of the
search space, “tolerating” constraint violations to gather more information about the input
graph.

To do near-miss recognition efficiently, the parser’s search must be focused on a small
number of non-terminals at a small number of places in the input graph. Grimson provided
theoretical confirmation of this in his study of constrained search. The mapping between
constrained search and right-hand side matching makes his results applicable to near-miss
recognition by flow graph parsing as well.

Grimson found that constrained search is efficient when indexing and selection are perfect,
as discussed in Section 6.2.5. However, an exponential amount of work is needed to tell
that  a  possibly  partially  occluded  object  model  is  not  in  a  scene,  even  when  good  (but  not 
perfect)  selection  techniques  are  performed.  So  it  is  important  that  indexing  techniques  are 
used  to  narrow  down  the  library  of  models,  rather  than  sequentially  searching  through  the 
library  and  using  the  exponential  process  to  rule  out  incorrect  models.  Also,  an  exponential 
amount  of  work  is  needed  to  find  an  object  model  in  a  cluttered  scene  if  adequate  selection 
techniques  are  not  used  to  distinguish  the  object  from  the  noise.  This  is  the  case  even  if 
perfect  indexing  is  done.  So  both  good  indexing  and  good  selection  are  needed  to  efficiently 
perform  recognition  of  partially  occluded  objects  in  cluttered  scenes. 

A  few  program  recognition  researchers,  such  as  Johnson  [65],  Lukey  [87],  and  Murray 
[95],  have  worked  on  the  problem  of  guiding  the  recognition  system  to  a  “best”  partial 
analysis  in  the  context  of  program  debugging  applications.  They  use  heuristics  based  on 
saliency,  mnemonic  names,  and  partial  analysis  size,  for  example.  Section  6.4  gives  some 
suggestions  for  ways  of  incorporating  other  possible  indexing  and  selection  techniques  into 
the  current  recognition  system. 

Choosing  a  Node  Ordering 

The  node  ordering  of  a  rule  determines  the  order  in  which  individual  unary  and  binary 
constraints  are  applied.  The  best  order  is  one  in  which  stronger  constraints  are  applied 
first.  An  automatic  assignment  of  node  orderings  to  rules  can  look  at  the  structure  of  the 
rules’  right-hand  sides  and  at  the  input  graph  to  get  clues  as  to  which  ordering  is  most 
likely  to  impose  stronger  constraints  earlier. 

Unary  Constraints 

The  unary  node-type  constraints  are  strongest  for  salient  node  types.  So  a  node-ordering 
in  which  salient  nodes  are  matched  first  is  best.  There  are  two  useful  notions  of  saliency. 
One  notion  is  a  node  type  that  is  rare  in  the  input  graph.  The  other  is  a  node  type  that 
only  appears  in  a  few  grammar  rules. 

The unary node-type constraints for nodes that are salient with respect to the input
graph are strong in that they reduce the branching factor of item trees. Applying them early
can help disambiguate partial analyses while they are still small. (Reduction of branching
is most beneficial near the top of item trees, since binary constraints can usually keep the
branching factor down at lower levels.)

Ideally,  node  orderings  that  are  based  on  saliency  of  node  types  with  respect  to  the 
input  graph  should  take  into  account  the  number  of  instances  of  non-terminal  as  well  as 
terminal  node  types  in  the  input  graph.  However,  this  requires  knowledge  of  the  results  of 
recognition. 

We  can  use  heuristics  to  automatically  produce  node  orderings  that  approximate  this 
ideal  assignment.  Given  a  right-hand  side,  we  can  compute  a  frequency  number  for  each 
right-hand  side  node.  The  nodes  of  a  rule’s  right-hand  side  are  then  ordered  from  smallest 
to  largest  frequency  of  their  node-type,  so  that  salient  nodes  are  earlier  in  the  ordering. 
(This  is  not  necessarily  a  strict  node  ordering.) 

For  each  terminal,  the  frequency  number  is  the  number  of  nodes  in  the  input  graph  with 
the  same  type.  For  a  non-terminal  A,  take  each  rule  R  for  A  and  recursively  compute  the 
frequency numbers of the nodes in R’s right-hand side, choosing the minimum frequency
number  as  the  frequency  of  A  with  respect  to  R.  Finally,  combine  these  frequency  numbers 
over  all  the  rules  for  A  to  get  A’s  frequency.  The  combination  function  (e.g.,  sum,  max, 
average)  chosen  depends  on  how  conservative  or  optimistic  we  want  the  heuristic  to  be. 
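
The frequency heuristic just described can be sketched as follows. This is an illustration under stated assumptions, not the system's implementation: the grammar is a plain dictionary from non-terminal to a list of right-hand sides, it is assumed to be non-recursive, and the node-type names and counts in the example are invented.

```python
def frequency(node_type, grammar, terminal_counts, combine=sum, memo=None):
    """Frequency number of a node type.

    Terminals: number of input graph nodes of that type.
    Non-terminals: for each rule take the minimum frequency over its
    right-hand side nodes, then combine the per-rule values.
    """
    memo = {} if memo is None else memo
    if node_type in memo:
        return memo[node_type]
    if node_type not in grammar:                      # terminal node type
        result = terminal_counts.get(node_type, 0)
    else:
        per_rule = [
            min(frequency(t, grammar, terminal_counts, combine, memo)
                for t in rhs)
            for rhs in grammar[node_type]
        ]
        result = combine(per_rule)
    memo[node_type] = result
    return result


def order_by_saliency(rhs, grammar, terminal_counts, combine=sum):
    """Order right-hand side nodes from smallest to largest frequency."""
    return sorted(
        rhs, key=lambda t: frequency(t, grammar, terminal_counts, combine))


# Invented example data.
terminal_counts = {"+": 12, "car": 3, "null?": 2, "cons": 5}
grammar = {"List-Walk": [["car", "null?"], ["cons", "+"]]}
```

Passing `combine=sum` gives the more conservative estimate (7 for `List-Walk` here) and `combine=max` a more optimistic one (5); ties keep their original order, so the result is not necessarily a strict ordering.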

The  advantage  of  matching  nodes  that  are  salient  with  respect  to  the  grammar  first  is 
that  the  growth  of  an  item  tree  for  a  rule  does  not  begin  until  the  salient  node  is  found. 
This  has  the  effect  of  only  activating  the  matching  process  for  a  particular  rule  when  it  is 
worth  it  (i.e.,  when  the  rule’s  right-hand  side  or  a  near-miss  of  it  is  likely  to  exist  in  the 
input  graph).  This  is  a  form  of  indexing.  It  helps  speed  up  recognition  and  it  also  produces 
better  partial  analyses  for  near-miss  recognition. 

An  issue  that  arises  when  using  saliency  measures  based  on  the  grammar  is  that  as  the 
parsing  proceeds,  the  grammar  is  changing.  As  the  set  of  item  trees  is  pruned  away,  the  set 
of  grammar  rules  under  consideration  is  effectively  becoming  smaller.  Since  the  saliency  of  a 
node-type  is  relative  to  the  grammar,  saliencies  change  as  the  grammar  changes.  Matching 
a  node  that  is  salient  with  respect  to  an  entire  grammar  might  narrow  down  the  grammar 
to  a  few  rules  that  contain  that  node.  Then,  with  respect  to  these  rules,  there  are  other 
salient  node  types  (which  might  not  have  been  salient  with  respect  to  the  entire  grammar). 
These  salient  node  types  should  be  matched  first,  to  disambiguate  between  the  possibilities, 
and so on. The point is that saliency with respect to a grammar changes as the grammar
changes, so if we are basing our node orderings on it, we will have to change the node
orderings dynamically as parsing proceeds.

Binary  Constraints 

Node orderings can also be created to force strong binary constraints to be checked earlier.
For example, the topological partial node ordering used in the experimental runs was effective
in reducing complexity. It ensured that no node was matched until all nodes preceding
it in the right-hand side flow graph had been matched. This meant that when a node is
matched, there are edge connection constraints applicable to it and its preceding nodes.
The partial items are always extended by complete items for nodes that can be constrained
the most by the preceding nodes.

Another  ordering  heuristic  is  to  match  nodes  earlier  that  have  more  binary  constraints 
applied  to  them.  For  example,  match  those  with  more  output  edges,  before  those  with  few 
outputs,  or  match  those  that  are  constrained  to  co-occur,  before  those  that  are  not.  The 
advantage  of  these  heuristics  is  that  they  require  no  knowledge  of  the  input  graph. 


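6.2.8 Summary of Item Count

Recall from Section 6.1.2 that the overall cost of the parsing algorithm is

$$
\begin{aligned}
& |I_T| \cdot (C_{\text{agenda-add}} + C_{\text{agenda-retrieve}} + C_{\text{duplicate-test}}) \; + \\
& |I_E| \cdot C_{\text{extend}} \; + \\
& |I_C| \cdot (C_{\text{chart-add}} + C_{\text{combination-lookup}}) \; + \\
& |I_R| \cdot C_{\text{instantiate-empty}} \; + \\
& |I_n| \cdot C_{\text{instantiate-terminal}} \; + \\
& |I_Z| \cdot C_{\text{zip-up}} \; + \\
& |I_I| \cdot (C_{\text{constraints-check}} + C_{\text{zip-up-lookup}})
\end{aligned}
$$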


The number of items created during initialization for the terminal nodes of the input
graph ($|I_n|$) is $n$, the number of nodes in the input graph. The number of empty partial
items also created during initialization ($|I_R|$) is the number of rules in the grammar ($|P|$).
This section has discussed the number of items created by extension and zip-up and how
constraints and node orderings influence the size of these sets ($|I_E|$ and $|I_Z|$). The number
of items in the chart is $|I_C| = (|I_E| - |I_D|) + n + |P|$, where $I_D$ is the set of duplicate items.
If strict node orderings are used, then $|I_D| = 0$. The set of complete items that enter the
chart ($I_I$) are those in $I_n$ and $I_Z$ and the subset of the complete items created by extension
that contains no duplicate items. The total number of items is
$|I_T| = |I_E| + n + |I_R| + |I_Z| = |I_C| + |I_D|$.

We  now  detail  the  costs  of  the  actions  that  are  performed  on  each  of  these  types  of 
items. 


6.3  Component  Costs 

The sizes of the various types of item sets are weighted in the complexity formula by the
costs of applying the basic parser actions to each type of item. The terms in the formula
are ordered by the typical size of the set of items in the term, based on the empirical study
of recognizing CST and PiSim. The first three terms are dominant. It is best for the costs
weighting them to be small. We will consider the cost of each of the parser’s actions in the
order in which it appears in the complexity formula.

The costs of adding an item to the agenda and retrieving an item from it,
$C_{\text{agenda-add}}$ and $C_{\text{agenda-retrieve}}$, are
small constants in the current implementation. They are implemented as simple queue
operations. In general, however, they may be more complex operations, depending on the
type of structure imposed on the agenda to implement more complicated search strategies.

$C_{\text{duplicate-test}}$ is the cost of testing whether an item is a duplicate of an existing item
already in the chart. There are two different tests used, depending on whether the item is
partial or complete.

To  describe  the  test  of  partial  items,  we  need  to  define  two  more  parts  of  the  structure 
of  items.  One  is  a  set  of  sub-items  which  are  complete  items  that  represent  the  recognition 
of  the  nodes  that  have  been  matched  so  far  in  the  item  rule’s  right-hand  side.  These  are  the 
items  that  have  successively  extended  partial  items  to  ultimately  result  in  this  item.  The 
other  new  part  of  items  is  a  set  of  super-items  which  are  items  that  resulted  from  extending 
a  partial  item  with  this  item.  Only  complete  items  have  super-items.  An  item  might  have 
more  than  one  super-item  if  a  sub-derivation  is  being  shared  between  two  derivation  trees. 
(Super-items and sub-items of an item are different from the item’s parent or children
in  item-trees.  Links  to  super-  and  sub-items  encode  the  structure  of  the  derivation  graphs 
generated  by  the  parser.  The  links  to  parent  and  children  items  in  an  item  tree  show  the 
history  of  extensions  performed  on  items  for  the  same  rule.) 

Each partial item will have a sub-item for each of the nodes of its rule’s right-hand side
that have been matched so far. If a duplicate $I_d$ of a partial item $I_p$ exists, $I_d$ will share all
of its sub-items with $I_p$. So, given any partial item $I_p$, we can tell if a duplicate of it exists
by taking any one of its sub-items $I_s$ and looking for one of the super-items of $I_s$ (other than $I_p$)
that has the same set of sub-items matched to the same nodes as $I_p$. If none is found, the
partial item is not a duplicate. The average cost is polynomial in the average number of
super-items an item can have and the number of sub-items being compared (which is the
size of the partial item being tested and which is less than the size of its rule’s right-hand
side). The average number of super-items is 2.84 in CST and 2.07 in PiSim. Right-hand side
sizes range from 1 to 7 nodes.
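
A minimal sketch of this duplicate test for partial items follows. The `Item` class and its fields are hypothetical stand-ins for the parser's item structure. The sketch also assumes that a new item is linked as a super-item of every one of its sub-items, so that probing any single sub-item is enough; the text above does not spell out how those links are maintained.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)                 # compare items by identity
class Item:
    rule: str
    sub_items: dict = field(default_factory=dict)    # rhs node -> complete Item
    super_items: list = field(default_factory=list)  # items built on this one


def extend(partial, node, complete):
    """Create a new item by matching `node` of the rule to `complete`."""
    new = Item(partial.rule, {**partial.sub_items, node: complete})
    for sub in new.sub_items.values():   # assumption: link to all sub-items
        sub.super_items.append(new)
    return new


def same_matching(a, b):
    """True if both items match the same sub-items to the same nodes."""
    return (a.sub_items.keys() == b.sub_items.keys()
            and all(a.sub_items[n] is b.sub_items[n] for n in a.sub_items))


def has_duplicate(partial):
    """Probe one sub-item's super-items for an item with the same matching."""
    if not partial.sub_items:
        return False
    probe = next(iter(partial.sub_items.values()))
    return any(other is not partial
               and other.rule == partial.rule
               and same_matching(other, partial)
               for other in probe.super_items)
```

The work done is bounded by the number of super-items of the probed sub-item times the number of sub-items compared, which matches the polynomial cost stated above.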

To  test  whether  a  duplicate  of  a  complete  item  Ic  exists,  we  look  in  the  chart  for  items 
with  the  same  label  as  Ic  at  the  location  of  Ic.  For  each  location  pointer  in  the  input 
and  output  mappings  of  Ic,  the  items  for  Ic's  label  at  that  location  pointer  are  retrieved. 
The  sets  of  items  retrieved  for  the  location  pointers  are  intersected.  The  average  cost  is 
polynomial  in  the  average  number  of  location  pointers  per  input  or  output  mapping  (3.21 
in CST, 2.92 in PISIM) and the average number of items retrieved (2.91 in CST, 2.61 in PISIM).

The  number  of  location  pointers  in  the  mappings  is  not  the  same  as  the  number  of 
inputs  and  outputs  of  the  left-hand  side  non-terminal  of  an  item’s  rule  or  the  number  of 
internal edges to immediately needed non-terminals. It depends on the degree of fan-out or
fan-in  of  edges  in  the  input  graph,  and  on  the  bushiness  of  nested  location  pointers  which 
represent aggregation. (In terms of the program recognition application, the size of the
nested location pointers representing aggregation depends on the complexity of the cliched
data structure: how many parts it has, how many parts its sub-parts have, and so on.)

The  cost  of  extension  Cextend  is  the  sum  of  the  cost  of 

•  copying  an  item:  linear  in  the  sizes  of  its  parts,  such  as  lists  of  callers  and  sub-items. 

•  updating  input  and  output  mappings:  polynomial  in  the  number  of  location  pointers 
in  the  input  and  output  mappings  of  the  complete  item. 

•  comparing  location  pointer  tuples  on  the  inputs  and  outputs  of  adjacent  non-terminals 
and  propagating  st-thru  matches:  polynomial  in  the  number  of  edges  in  the  right-hand 
side  and  the  number  of  location  pointers  per  right-hand  side  edge.  (There  may  be 
more  than  one  location  pointer  on  an  edge  due  to  fan-in  or  fan-out  and  aggregation.) 
The  average  number  of  edges  in  a  right-hand  side  is  0.53  and  the  average  number  of 
location  pointers  per  edge  is  2.63  in  CST  and  4.16  in  PISIM. 

The cost of recording an item (complete or partial) in the chart, Cchart-add, is linear
in  the  number  of  location  pointers  in  the  input  and  output  mappings  of  the  item.  This  is 
because  the  item  is  recorded  in  the  chart  multiple  times,  once  for  each  location  pointer. 
(For  partial  items,  the  “output  mappings”  are  the  sets  of  location  pointers  on  the  edges 
to  immediately  needed  non-terminals.)  The  chart  is  broken  into  two  parts,  one  containing 
only  complete  items  and  the  other  containing  only  partial  items.  The  set  of  complete  items 
is  indexed  on  the  label  of  the  item  and  on  the  location  pointers  of  the  item’s  input  and 
output  mappings.  The  set  of  partial  items  is  indexed  on  the  location  pointers  and  node 
types  of  the  item’s  immediately  needed  non-terminals.  This  makes  it  easier  to  look  up  all 
complete  items  for  a  particular  node  type  at  a  particular  location  (to  combine  with  a  given 
partial  item),  and  to  look  up  all  partial  items  needing  a  particular  node  type  at  a  particular 
location  (to  combine  with  a  given  complete  item).  The  average  number  of  times  an  item  is 
entered into the chart is 7.51 in CST and 6.35 in PISIM.
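The two-part chart organization can be illustrated with a small sketch. The class and method names below are hypothetical, and items and location pointers are stood in for by plain strings; the point is only that an item is entered once per location pointer and that both kinds of lookup become direct index probes.

```python
from collections import defaultdict


class Chart:
    def __init__(self):
        # (label, location pointer) -> complete items recorded there
        self.complete = defaultdict(list)
        # (needed node type, location pointer) -> partial items needing it
        self.partial = defaultdict(list)
        self.entries = 0  # number of times any item was entered

    def add_complete(self, item, label, location_pointers):
        # One entry per location pointer in the input and output mappings.
        for lp in location_pointers:
            self.complete[(label, lp)].append(item)
            self.entries += 1

    def add_partial(self, item, needed):
        # `needed` maps each immediately needed node type to the location
        # pointers on the edges leading to it.
        for node_type, pointers in needed.items():
            for lp in pointers:
                self.partial[(node_type, lp)].append(item)
                self.entries += 1

    def complete_items_for(self, node_type, lp):
        """Complete items of a node type at a location (to extend a partial item)."""
        return list(self.complete.get((node_type, lp), ()))

    def partial_items_needing(self, node_type, lp):
        """Partial items needing a node type at a location."""
        return list(self.partial.get((node_type, lp), ()))

    def complete_items_at_all(self, label, location_pointers):
        """Items with this label found at every pointer (duplicate candidates)."""
        result = None
        for lp in location_pointers:
            found = self.complete.get((label, lp), [])
            if result is None:
                result = list(found)
            else:
                result = [i for i in result if i in found]
        return result or []
```

The last method mirrors the intersection step used when testing for a duplicate of a complete item; a full test would still have to confirm that a candidate's mappings are identical, not merely overlapping at the probed pointers.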

Ccombination-lookup is the cost of looking up partial or complete items to combine with
an item that is entering the chart. Given a complete item for a non-terminal A, looking
up  partial  items  for  it  to  extend  involves  taking  each  location  pointer  in  the  mappings  of 
the  complete  item  and  looking  up  all  partial  items  that  immediately  need  A  at  the  location 
pointer.  The  candidate  items  retrieved  are  organized  by  item  and  for  each  candidate, 
a  validity  check  is  performed.  The  validity  check  is  an  application  of  unary  and  binary 
constraints.  So,  the  cost  of  looking  up  partial  items  is  a  polynomial  in  the  number  of 
location  pointers  in  the  mappings,  the  number  of  candidate  items  retrieved,  and  the  cost  of 
applying  the  unary  and  binary  constraints. 

Given a partial item that immediately needs non-terminals A1, ..., An, a similar cost is
incurred in looking up complete items for each of these non-terminals. This cost is summed
over  the  sets  of  location  pointers  on  the  edges  going  to  each  of  the  immediately  needed 
non-terminals.

The cost of checking parse-interleaved constraints, Cconstraint-check, is hard to characterize,
since the constraint expressions can be arbitrarily complex. However, in the current
system,  the  constraints  applied  are  very  simple  and  this  term  contributes  little. 

The cost of looking up items to zip up with a given item Ia is Czip-up-lookup. This
involves looking up each item Ic for Ia's label A that satisfies the following conditions:

• either all of the edges pointed to by the location pointers in Ic's and Ia's input
mappings  share  the  same  source  ports  or  all  of  the  edges  pointed  to  by  the  location 
pointers  in  their  output  mappings  share  the  same  sink  ports,  or  both, 

•  none  of  the  input  mappings  of  either  item  overlap  (i.e.,  contain  common  location 
pointers)  and  neither  do  the  output  mappings,  and 

•  the  attribute  values  of  the  zipped  up  item’s  left-hand  side  are  defined,  according  to 
the  attribute  combination  function. 

The  cost  of  doing  this  is  polynomial  in  the  number  of  location  pointers  contained  in  the 
input and output mappings of Ia, in the number of items retrieved per location pointer,
and  in  the  cost  of  applying  the  attribute  combination  function. 
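The three zip-up conditions can be read as a predicate over a pair of items. The sketch below is one possible reading, under simplifying assumptions that are not in the text: each item has a single input and a single output, so its mappings are flattened into one set of edges each, and the attribute combination function is passed in as `combine` and signals "undefined" by returning `None`. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    source_port: str  # port the edge leaves
    sink_port: str    # port the edge enters


@dataclass
class ZipItem:
    label: str
    inputs: frozenset   # edges named by the input mapping's location pointers
    outputs: frozenset  # edges named by the output mapping's location pointers
    attrs: dict = field(default_factory=dict)


def one_port(edges, side):
    """True if every edge in `edges` has the same port on the given side."""
    return len({getattr(e, side) for e in edges}) == 1


def can_zip_up(ia, ic, combine):
    if ia is ic or ia.label != ic.label:
        return False
    # Condition 1: shared source ports on inputs, or shared sink ports on outputs.
    shares = (one_port(ia.inputs | ic.inputs, "source_port")
              or one_port(ia.outputs | ic.outputs, "sink_port"))
    # Condition 2: no common location pointers in the mappings.
    disjoint = not (ia.inputs & ic.inputs) and not (ia.outputs & ic.outputs)
    # Condition 3: the combined attribute values are defined.
    return shares and disjoint and combine(ia.attrs, ic.attrs) is not None
```

In the parser itself the candidates for `ic` would come from a chart lookup on the label and location pointers of `ia`, so the predicate is applied only to the items retrieved per location pointer.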

The costs of creating empty partial items, Cinstantiate-empty, and complete items for
terminal nodes, Cinstantiate-terminal, during instantiation are both small constants.

The  cost  of  zipping  up  a  set  of  items  Czip-up  is  polynomial  in  the  number  of  items 
being  zipped  up  (for  the  example  programs,  the  typical  number  is  2  or  3)  and  in  the  cost 
of  zipping  up  the  parts  of  the  items  (e.g.,  unioning  sets  of  callers). 

6.4  Other  Performance  Improvements 

This  section  contains  suggestions  for  improving  the  performance  of  the  parser.  These  are 
useful  when  constraints  are  not  strong  enough  to  prune  the  parser’s  search  adequately.  They 
are  also  important  if  the  parser  is  to  be  used  for  near-miss  recognition  in  the  future.  Most 
of  these  can  benefit  from  advice  from  an  external  agent. 

6.4.1  Decomposition 

Parsing  smaller  flow  graphs  can  be  easier  than  parsing  larger  ones  if  the  smaller  flow  graphs 
are  less  ambiguous.  Decomposing  an  input  graph  and  then  focusing  the  parser  only  on 
sub-flow  graphs  within  the  decomposition  boundaries  can  speed  up  recognition. 

John Hartman [55] demonstrates the advantage of decomposition in program recognition.
He provides an efficient recognition technique for cliched control concepts, which
hierarchically decomposes a program represented as a control flow graph into propers
(single-entry/single-exit control flow sub-graphs) and performs simple graph matching
within the propers.

This section gives some examples of program domain-specific heuristic decompositions
that  can  be  used  to  focus  our  parser.  They  are  all  static  decompositions  that  occur  before 
parsing  is  begun.  Section  6.4.3  discusses  dynamic  decompositions. 

Subroutinization  provides  one  type  of  heuristic  decomposition.  The  parser  can  be  forced 
to  recognize  non-terminals  only  within  the  boundaries  of  a  subroutine  or  module.  (When 
using this heuristic, there is no need to “flatten” the program by expanding out all subroutines
within their callers. When the flow graph for an entire subroutine body is recognized
as  a  non-terminal  A,  all  nodes  representing  calls  of  that  subroutine  can  be  replaced  by  a 
node  of  type  A.) 

An  analogous  decomposition  can  be  made  based  on  data  structure  organization.  The 
idea  is  to  require  a  non-terminal  to  be  recognized  only  in  sub-flow  graphs  whose  nodes  all 
represent operations that are acting on parts of the same user-defined data structure. For
example,  1+  and  AREF  occur  all  over  the  input  graph,  but  we  should  not  pair  them  up  as  an 
instance  of  the  Stack-Pop  cliche  if  one  is  applied  to  the  Tail  part  of  a  user-defined  structure 
Queue  and  the  other  is  applied  to  the  Instructions  part  of  a  Handler.  Since  our  cliches  are 
primarily  based  on  dataflow,  this  partitioning  seems  natural.  A  single  dataflow  slice  is  not 
always  the  best  unit  of  decomposition,  since  aggregate  data  structures  typically  involve  a 
bundle  of  slices.  This  partitioning  allows  a  bundle  of  slices  to  be  considered  as  a  unit. 

Both of these decompositions work best if the programmer’s decomposition of the program
into procedural and data abstractions is very close to a typical way programs in that
domain  are  decomposed. 

The  main  problem  with  focusing  the  parser  on  each  partition  independently  is  that 
completeness  can  be  lost  if  cliches  occur  across  the  partition  boundaries.  A  more  flexible 
partitioning  technique  is  to  augment  the  extendibility  criterion  of  the  parser  with  a  binary 
partitioning constraint which requires that a complete item can only extend a partial item
if  all  of  the  partial  item’s  sub-items  and  the  complete  item  represent  the  recognition  of 
sub-flow  graphs  in  the  same  partition.  Combination  attempts  that  fail  this  constraint  can 
be  postponed,  rather  than  eliminated  altogether.  This  allows  certain  combinations  to  be 
preferred over others, while allowing less favorable combinations to still be tried in a
try-harder phase.
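A minimal sketch of this postponement scheme follows. The dictionary layout and function names are hypothetical; the sketch only shows how a binary partitioning constraint can sort combination attempts into a preferred set and a postponed set instead of discarding the failures.

```python
def same_partition(partial, complete):
    """Binary partitioning constraint: all sub-items of the partial item and
    the complete item must lie in the same partition."""
    return all(p == complete["partition"]
               for p in partial["sub_item_partitions"])


def order_attempts(attempts):
    """Split (partial, complete) pairs into preferred and postponed lists."""
    preferred, postponed = [], []
    for partial, complete in attempts:
        bucket = preferred if same_partition(partial, complete) else postponed
        bucket.append((partial["name"], complete["name"]))
    return preferred, postponed
```

The preferred attempts would be run first; the postponed ones are kept for a later try-harder phase, so completeness is not lost when a cliche crosses a partition boundary.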

The  drawback  with  this  scheme  is  that  more  combinations  between  pairs  of  items  will 
be  attempted.  When  parsing  is  focused  on  sub-flow  graphs  independently,  the  combinations 
that  cross  boundaries  are  not  even  attempted. 

An  advantage  of  incorporating  a  partitioning  constraint  into  the  extendibility  criterion  is 
that  it  can  be  selectively  applied.  It  would  be  like  any  other  match-interleaved  constraint  in 
that  it  can  be  specified  on  a  rule-by-rule  basis  to  apply  to  certain  (not  necessarily  all)  nodes 
of  each  rule’s  right-hand  side.  The  match-interleaved  co-occurrence  constraint  currently 
used  by  the  parser  can  be  seen  as  a  partitioning  constraint  that  requires  certain  right-hand 
side  nodes  to  occur  within  the  same  control-environment  boundary. 

Finally, the recognition system can make use of advice from an external agent that has
access to more information about the program than is found in the source code. People
can  often  break  up  the  program  into  pieces  that  “go  together”  in  that  they  provide  a 
particular  functionality  or  belong  to  the  same  abstract  domain-specific  concept.  They  base 
this  decomposition  on  design  documentation  and  program  comments  or  even  just  names 
of  subroutines  and  variables.  (As  part  of  the  DESIRE  project  [12,  13]  Josiah  Hoskins  has 
proposed a neural-network-based approach to automating this process.) This information
can  be  used  to  focus  the  recognition  system  on  particular  sub-flow  graphs  and  also  to  suggest 
cliches  to  look  for  within  them  (i.e.,  index  into  the  cliche  library  -  see  the  next  section). 

6.4.2  Indexing 

Efficiency  can  be  gained  not  only  by  reducing  the  focus  of  the  parser  to  smaller  sub-flow 
graphs,  but  also  by  reducing  its  focus  to  a  smaller  subset  of  the  grammar.  For  large 
grammars,  it  is  advantageous  for  recognition  to  be  sub-linear  in  the  size  of  the  grammar. 

The current parser makes use of indexing to some extent in that it only creates (non-empty)
items for rules when part of the rule’s right-hand side has been found in the input
graph.  The  chart’s  structure  allows  the  parser  to  index  on  the  node  type  found  to  retrieve 
partial  items  that  immediately  need  it.  Heuristics  have  been  discussed  in  Section  6.2.7  for 
choosing  a  node  ordering  that  will  force  salient  nodes  to  be  matched  first.  This  stunts  the 
growth  of  item  trees  until  it  is  likely  that  a  non-terminal  instance  or  a  near-miss  of  one 
exists  in  the  input  graph. 

Advice  can  also  be  given  to  the  program  recognition  system  from  an  external  agent, 
based  on  expectations  about  which  cliches  are  likely  to  be  found  in  the  program.  This  can 
be  used  to  narrow  down  the  grammar  given  to  the  parser. 

6.4.3  Interleaved  Decomposition  and  Indexing 

We  can  also  interleave  indexing  and  decomposition  (selection)  techniques  with  the  parsing 
process.  The  idea  is  to  use  strict  node  orderings  first  and  then  try  harder  later  by  giving 
certain partial items partial node orderings, expanding their immediately needed nodes
based  on  the  new  orderings,  and  returning  them  to  the  agenda  to  continue  parsing.  Advice 
from  an  expectation-driven  component  or  heuristics  can  be  used  to  choose  the  partial  items 
to  “encourage”.  An  example  heuristic  might  be  to  choose  partial  items  that  have  started 
recognizing non-terminals in an area of the input graph in which no cliche has been fully
recognized.  Another  heuristic  is  to  choose  the  partial  items  that  have  the  salient  nodes  of 
their  right-hand  side  matched  already. 

Interleaved indexing and decomposition techniques have an advantage over static techniques
that are applied before recognition in that they can make use of deeper knowledge
about  the  input  graph  based  on  the  previous  recognition  results. 

Hierarchically representing patterns in a graph grammar facilitates this process. If a
“flat” pattern were searched for, using a strict node ordering, the search would end as
soon as the parser fails to match the “next” node in the ordering. With a hierarchical
organization,  more  parts  of  the  pattern  can  be  recognized  and  used  to  make  a  more  informed 
decision  about  which  candidate  partial  analyses  should  be  pursued  further  with  a  partial 
node  ordering.  This  information  can  also  be  used  to  decide  which  node  ordering  to  try. 

6.4.4  Avoiding  Unnecessary  Copying 

When  a  partial  item  is  extendable  by  a  complete  one,  a  copy  of  the  partial  item  is  created 
and  the  copy  is  extended.  The  reason  is  that  this  helps  the  parser  deal  with  ambiguity 
and  allows  it  to  perform  partial  recognition  and  incremental  analysis.  (See  Section  3.5.) 
However, sometimes a large number of the copies made are unnecessary, because the
input graph is not ambiguous, does not contain multiple instances of some node types, or
is expected to remain static. This section suggests ways of avoiding unnecessary copying.

We  can  identify  unnecessary  copies  retrospectively  by  looking  for  partial  items  that  have 
been  extended  with  only  one  complete  item  for  the  same  immediately  needed  node.  In  the 
CST  example  (using  strict  node  orderings),  the  percentage  of  copies  that  were  unnecessary 
is  13.5%.  The  percentage  of  the  total  number  of  items  that  are  the  results  of  unnecessary 
copies  is  10.9%.  In  the  PISIM  example  (using  strict  node  orderings),  the  percentage  of  copies 
that  were  unnecessary  is  14.7%.  The  number  of  items  that  are  the  result  of  an  unnecessary 
copy  as  a  percentage  of  the  total  number  of  items  is  11.6%. 
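The retrospective test can be sketched as a pass over a log of extensions. The log format below is an assumption made for illustration: each entry is a hypothetical triple (partial item, immediately needed node, complete item used), with one copy made per entry.

```python
from collections import defaultdict


def unnecessary_copy_rate(extensions):
    """Percentage of copies whose partial item was extended by only one
    complete item for the same immediately needed node."""
    by_slot = defaultdict(set)
    for partial, node, complete in extensions:
        by_slot[(partial, node)].add(complete)
    total = sum(len(completes) for completes in by_slot.values())
    unnecessary = sum(1 for completes in by_slot.values()
                      if len(completes) == 1)
    return 100.0 * unnecessary / total if total else 0.0
```

A slot with a single complete item means the original partial item was never needed again for that node, so extending it in place would have produced the same result.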

Unnecessary  copies  contribute  to  both  the  height  and  width  of  item  trees.  When  strict 
node  orderings  are  used,  they  contribute  only  to  the  height  of  trees. 

The  following  are  a  few  techniques  for  avoiding  copying. 

1.  Lazy  copying:  Make  a  copy  only  when  it  is  necessary.  Extend  partial  items  with 
complete  items  without  copying.  However,  when  an  alternative  complete  item  arises 
for an already matched node A in some item I0, make a copy, I1, of I0 and restore it
to the state I0 was in before the old complete item IA1 was used to extend it. To do
this, we remove any links it has to super-items (since only complete items can have
super-items). We must also find out which sub-items of I1 must be retracted. These
are IA1 and all complete items that extended it after IA1, which can be computed from
the node ordering and a history of the immediately needed sets. These are removed
from I1's set of sub-items and all information associated with I1 that was derived from
them  is  removed.  (This  requires  keeping  track  of  dependencies  of  parts  of  an  item  on 
the  sub-item  parts,  such  as  its  inputs  and  outputs.  It  also  requires  allowing  partial 
items  to  be  indexed  based  on  already  matched  nodes  as  well  as  immediately-needed 
nodes,  so  that  new  complete  items  can  be  paired  up  with  them.)  Once  the  retraction 
is  finished,  Ii  can  be  extended  with  the  alternative  complete  item. 

This scheme is only worthwhile when the majority of copying is unnecessary. It
can be applied selectively to certain extensions if the parser has been given advice
that certain node-types are not likely to be found more than once or in a partially
ambiguous situation.

2.  Structure-sharing:  A  common  technique  to  avoid  copying  when  there  is  little  change 
between  the  original  and  the  copy  is  to  share  the  common  structure.  The  parser 
can  store  one  “original  item”  per  rule  plus  a  log  of  augmentations,  representing  the 
successive  extensions.  This  is  a  more  compact  way  to  record  intermediate  states  in  the 
search. This technique is used in resolution theorem proving [14] and in unification-based
grammar parsing [67, 104].

3. Estimating Number of Instances: We can heuristically count the maximum possible
number  of  instances  of  a  particular  node  type,  based  on  the  node  type  distribution  of 
the  input  graph.  As  soon  as  the  maximum  number  of  instances  of  a  node-type  A  are 
entered  in  the  chart,  if  a  partial  item  immediately  needing  A  arises,  the  parser  can 
tell  whether  there  is  more  than  one  possible  complete  item  for  A  that  can  extend  it. 
If  there  is  only  one,  then  the  partial  item  need  not  be  copied  before  being  extended. 
However, this scheme is only beneficial if the heuristic for counting instances is good*
and  most  of  the  partial  items  that  need  a  node-type  A  enter  the  chart  after  the 
maximum  number  of  instances  of  A  have  been  found.  An  alternative  is  to  use  a  less 
conservative  heuristic  that  computes  a  lower  bound  on  the  number  of  instances  in 
conjunction  with  lazy  copying.  This  allows  copying  to  be  prevented  earlier,  without 
sacrificing  safety. 

4. Restricted Control Strategy: The parser can be forced to produce all complete items
for node-types of a particular height h in the grammar before going up to the next
height h + 1, starting with the terminal node types (h = 0). This guarantees that all
instances  of  a  node-type  A  have  been  found  when  a  partial  item  immediately  needing 
A  enters  the  chart.  The  partial  item  need  not  be  copied  before  being  extended  if  only 
one  complete  item  for  A  can  extend  it.  The  disadvantage  is  that  the  control  of  the 
parser  is  severely  restricted. 
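Of the techniques listed above, structure-sharing is the easiest to illustrate in isolation. The sketch below assumes a hypothetical representation in which an item is a pointer to the newest entry of an immutable log of extensions; two alternative extensions of the same partial item then share the whole earlier log instead of copying it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Extension:
    node: str            # right-hand-side node that was matched
    complete_item: str   # complete item matched to it
    previous: Optional["Extension"] = None  # shared tail of earlier extensions


def extend(log, node, complete_item):
    """Return a new log entry; the existing log is shared, never copied."""
    return Extension(node, complete_item, log)


def matched(log):
    """Reconstruct the node -> complete item bindings of an item from its log."""
    bindings = {}
    while log is not None:
        bindings.setdefault(log.node, log.complete_item)
        log = log.previous
    return bindings
```

Each extension costs one small record regardless of the size of the item, at the price of walking the log whenever the full set of bindings is needed.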

The decision whether to avoid copying, and the technique used, depend on the severity of the problem
of  unnecessary  copying.  In  the  two  example  programs,  it  is  not  severe  enough  to  merit  the 
overhead  of  these  techniques. 

6.5  Conclusion 

This  section  has  shown  the  following. 

• Although flow graph parsing is exponential in the worst case, it is feasible to apply it
to practical partial program recognition. Structural (node-type and edge connection)
constraints as well as program domain-specific constraints (e.g., co-occurrence) are
able to control the complexity in practice.

* Perfectly counting the number of instances of a node-type is no easier than recognition itself.

•  The  type  of  node  ordering  imposed  on  the  right-hand  side  nodes  of  rules  affects  the 
parser’s  efficiency.  Strict  node  orderings  focus  the  search,  generating  fewer  partial 
analyses  and  duplicate  items  than  partial  node  orderings.  This  reveals  a  trade-off 
between  efficiency  and  recognition  power.  The  choice  of  how  to  order  nodes  within 
a  strict  or  partial  node  ordering  also  affects  performance.  This  choice  can  be  made 
with  the  help  of  external  advice  or  heuristics.  It  may  need  to  dynamically  change  as 
parsing  proceeds. 

• The capability of generating maximally-sized partial recognitions of cliches (i.e., near-miss
recognition) is expensive. Future near-miss recognition capabilities must take
advantage of advice and automated techniques for indexing and decomposition to be
feasible. These techniques can be interleaved profitably with recognition, rather than
being performed statically beforehand.

Chapter 7


Conclusions 


We  have  developed  and  studied  a  graph  parsing  approach  to  program  recognition  in  which 
programs are represented as attributed flow graphs and the cliche library is encoded as an
attributed  graph  grammar.  Graph  parsing  is  used  to  recognize  cliches  in  the  code.  We  have 
demonstrated  that  this  graph  parsing  approach  is  a  feasible  and  useful  way  to  automate 
program  recognition. 

The  approach  has  two  key  features.  One  is  the  representation  shift  it  employs.  The 
other  is  its  exhaustive,  systematic,  but  flexible  control  strategy.  The  graph  representation 
is  able  to  suppress  many  common  forms  of  program  variation  which  hinder  recognition. 
This  enables  our  recognition  approach  to  be  robust  under  syntactic,  organizational,  and 
implementational  variation,  as  well  as  variation  due  to  delocalization,  unfamiliar  code,  and 
common function-sharing optimizations. Difficulties arise when a program’s data and control
flow are implicit or derived or cannot be determined statically.

The  flow  graph  formalism  is  able  to  concisely  encode  algorithmic  and  data  aggregation 
cliches  whose  constraints  are  primarily  based  on  data  and  control  flow.  These  include 
not  only  general-purpose  programming  cliches,  but  also  cliches  specific  to  the  simulation 
domain.  Limitations  arise  in  capturing  loosely  constrained  cliches.  Although  the  flow  graph 
formalism  allows  us  to  encode  cliches  on  a  high  level  of  abstraction,  the  level  of  abstraction  is 
still  limited  by  the  amount  of  detail  that  must  be  specified  about  the  cliches  (e.g.,  operation 
types  and  arity,  dataflow  connections,  control  environment  relationships). 

In  studying  the  graph  parsing  approach,  we  have  experimented  with  two  real-world 
simulator  programs.  We  empirically  and  analytically  studied  the  computational  cost  of 
our  recognition  system  with  respect  to  these  programs.  We  have  found  that  although  our 
graph  parsing  algorithm  is  exponential  in  the  worst  case,  its  complexity  is  reduced  in  its 
practical  application  to  program  recognition.  Structural  (node-type  and  edge  connection) 
constraints  as  well  as  constraints  which  are  specific  to  the  program  recognition  application 
(e.g.,  co-occurrence)  improve  the  parser’s  performance  in  practice.  Section  7.1  discusses  the 
need  for  more  empirical  study. 

Section 7.2 discusses some open research issues that have not yet been fully explored.
An important future goal is to complement our code-driven technique with an expectation-driven
technique that provides guidance based on such knowledge as the program's goals,
problem  domain,  and  documentation.  With  its  flexibility,  our  recognition  architecture  forms 
a  seed  for  this  future  hybrid  program  understanding  system.  It  can  make  use  of  advice  and 
guidance  from  external  agents.  In  Section  7.2.5,  we  summarize  our  observations  of  typical 
forms  of  advice  that  would  be  helpful  to  our  recognition  system  in  controlling  its  complexity 
and  its  search  for  cliches. 

Section 7.3 gives a comparative summary of related work in program recognition. Finally,
in Section 7.4, we briefly discuss applications of program recognition and of our
parsing  formalism  in  general. 

7.1  Empirical  Studies 

Our  study  is  a  step  toward  understanding  a  particular  recognition  technique  in  the  context 
of  real-world  programs.  It  tries  to  break  out  of  the  “toy”  program  rut.  Our  example 
programs  are  medium-sized  and  not  written  by  us.  They  start  to  give  some  indication  of 
what is typical in terms of characteristics of real-world programs. They contain domain-specific
cliches as well as general utility cliches. They also contain unfamiliar code. This
allows  us  to  study  the  ability  of  our  parsing-based  technique  to  perform  various  types  of 
partial  recognition. 

However,  it  is  important  to  keep  the  findings  of  our  empirical  studies  with  just  two 
programs  in  perspective.  We  have  made  some  general  observations  that  we  expect  to  be  true 
of  programs  and  libraries  other  than  those  studied  here.  For  example,  we  point  out  general 
classes  of  variation  that  are  handled,  which  types  of  constraints  are  effective  in  improving 
performance,  and  situations  in  which  partial  recognition  can  occur.  On  the  other  hand,  we 
have  also  made  specific  observations  about  recognizing  these  programs  using  the  current 
library.  For  example,  we  observed  that  recognition  by  graph  parsing  can  be  done  efficiently 
in  practice.  We  also  discuss  weaknesses  of  our  representation  and  approach,  but  only  those 
that  we  encountered  in  our  study.  This  is  not  a  complete  list.  These  are  interesting  only  if 
these  programs  and  the  library  are  typical. 

Our  example  programs  are  still  small,  relative  to  real-world  programs  in  the  software 
industry.  There  are  bound  to  be  issues  of  scaling  up  to  large  programs  that  have  not  yet 
been  encountered.  More  empirical  studies  are  needed  to: 

•  expand  and  refine  the  cliche  library, 

•  identify  more  classes  of  variation  that  can  or  cannot  be  tolerated, 

•  determine  how  severe  and  common  the  limitations  are  that  we  have  pointed  out, 

•  identify  other  factors  that  affect  efficiency, 

• determine if our experiences with good performance were lucky or typical, and

• evaluate the ability of the existing system to recognize new programs.


7.2  Future 

This  section  discusses  areas  in  which  additional  research  is  needed. 

7.2.1  Multiple  Recursion 

Currently,  GRASPR  can  represent  and  recognize  singly-recursive  programs.  In  the  future, 
we  will  extend  its  attribute  language  to  capture  the  control  flow  information  of  multiply 
recursive  programs  as  well.  This  involves  a  straightforward  generalization  of  recursion 
information  triples  to  hold  more  than  one  feedback-ce  -  one  for  each  recursive  call.  To 
express  constraints  on  the  control  environment  attributes  of  these  programs,  we  will  need 
new  ways  of  referring  to  particular  feedback-ces.  We  can  no  longer  refer  simply  to  the 
“feedback-ce  in  the  innermost  recursion”  containing  a  particular  operation  or  test.  We 
may  need  to  identify  common  forms  of  multiple  recursions,  such  as  the  familiar  binary  tree 
recursion, in which the feedback-ces are related in standard ways. Then individual feedback-ces
can be referred to, based on their relationship to others in the multiple recursion.

In  addition,  more  research  is  needed  to  extend  the  temporal  abstraction  techniques  to 
abstract multiply recursive programs. There may be some common types of multiple recursion
for which temporal abstraction is a straightforward generalization of the techniques for
singly recursive programs. For example, Rich [110] (Section 9.4) briefly discusses temporal
abstraction of binary tree recursions. In these programs, the feedback-ces are the same control
environment. Other programs seem not to be amenable to temporal abstraction, such
as  those  in  which  one  feedback-ce  is  C  the  other.  (This  arises  when  two  or  more  functions 
are mutually recursive and one calls itself, as in the familiar Evaluate/Apply recursion.)

Because  the  current  implementation  of  GRASPR  is  not  able  to  translate  multiply-recursive 
programs into meaningful attributed flow graphs, we selectively flattened the Evaluate/Apply
recursion within PISIM to avoid generating more than one recursive call. During the translation
of the program to a plan, we specifically advised that the box representing the call
to the function Evaluate not be expanded into a flow graph representing the function’s
body. The resulting flow graph contained only one recursive call (in the iterative mapping
of Evaluate over a list of Arguments to which an operation is to be applied). The function
Evaluate in PISIM corresponds to what we would like to recognize as the “Evaluate” cliche.

7.2.2  Interfacing  with  Other  Recognition  Techniques 

Recall  from  Section  5.2.3  that  we  had  difficulty  encoding  the  Evaluate  cliche,  due  to  its 
loose  constraints  on  data  and  control  flow.  Suppose  that  we  not  only  advise  GRASPR  not  to 
expand  the  node  representing  the  call  to  Evaluate,  but  we  also  specify  that  it  is  an  instance 
of the “Evaluate” cliche. (Normally, when a user specifies that a function is not to be
expanded and the function’s name happens to be a non-terminal in the grammar, GRASPR systematically
renames the function. We specify that the function is an instance of the “Evaluate” cliche
by overriding this renaming and labeling the node “Evaluate.”)

This  can  be  seen  as  a  way  to  use  results  from  another  recognition  technique  (in  this 
case,  performed  by  people),  which  applies  more  flexible  constraints  and  can  recognize  the 
body  of  Evaluate  as  the  “Evaluate”  cliche.  In  other  words,  GRASPR  uses  results  from  another 
recognition  technique  in  the  form  of  an  already  reduced  non-terminal  “Evaluate”  which  the 
other  technique  inserted  into  the  flow  graph  representing  the  program. 

An  alternative  way  for  GRASPR  to  use  recognition  results  from  other  techniques  is  for  these 
techniques  to  create  items  representing  the  recognition  results  and  add  them  directly  to 
GRASPR's  parser  agenda.  For  example,  rather  than  directly  relabeling  the  node  representing 
the  call  to  Evaluate,  a  complete  item  can  be  created  for  the  “Evaluate”  non-terminal  and 
added  to  the  parser’s  agenda.  This  has  the  advantage  that  the  program  is  not  destructively 
modified  by  the  insertion  of  the  already-reduced  non-terminal. 
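
The contrast between relabeling a node and queuing an item can be sketched as follows. This is a minimal illustration, not GRASPR's actual data structures: the `Item` and `Parser` classes, the `add_external_recognition` method, and the node identifiers are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """A parser item: a non-terminal matched against a set of flow-graph nodes."""
    nonterminal: str
    matched_nodes: frozenset
    complete: bool = True
    source: str = "parser"  # which technique produced the item (hypothetical field)


@dataclass
class Parser:
    agenda: list = field(default_factory=list)

    def add_external_recognition(self, nonterminal, nodes, source):
        """Queue a complete item that was produced outside the parser."""
        item = Item(nonterminal, frozenset(nodes), complete=True, source=source)
        self.agenda.append(item)
        return item


# The flow graph keeps its original node labels; nothing is relabeled.
flow_graph = {"n17": "call:Evaluate", "n18": "call:Apply"}
snapshot = dict(flow_graph)

parser = Parser()
item = parser.add_external_recognition("Evaluate", ["n17"], source="user-advice")
```

Because the recognition result lives only on the agenda, the input graph compares equal to its snapshot afterwards, which is the non-destructive property described above.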

7.2.3  Disambiguating  Data  Structure  Operation  Instances 

GRASPR has been designed to exhaustively and algorithmically recognize all cliches in a
program.  It  does  not  employ  global  consistency  checks  to  rule  out  some  analyses  or  to 
disambiguate  multiple  views  of  the  same  part  of  a  program.  Its  recognition  process  is 
“monotonic”  in  that  new  recognitions  cannot  invalidate  previously  recognized  structures. 
Recognition  of  one  cliche  does  not  depend  on  the  failure  to  recognize  another  cliche. 

There  are  two  main  reasons  for  this.  One  is  that  the  code-driven  parsing  approach  is  not 
best  suited  to  perform  the  disambiguation  of  multiple  views  or  global  consistency  checks. 
These  should  be  done  by  a  higher-level  control  mechanism  that  has  access  to  information 
other  than  the  program’s  data  and  control  flow.  It  may  have  expectations  about  which 
interpretations  are  most  likely.  Also,  the  parsing  approach  does  relatively  local  constraint 
checking.  All  consistency  checks  and  disambiguation  refer  to  individual  instances  of  cliches 
that are parts of some larger cliche. A higher-level mechanism can quantify over cliche
instances  that  are  not  explicitly  related  by  being  part  of  some  larger  cliche. 

The  second  reason  that  GRASPR  generates  multiple,  possibly  ambiguous  analyses  is  that 
sometimes  multiple  views  are  useful  in  understanding  a  program.  A  higher-level  control 
mechanism  may  require  different  views  at  different  times,  depending  on  how  the  recognition 
results  are  being  used. 

The interaction between GRASPR and a higher-level control mechanism would be particularly
profitable in the recognition of aggregate data cliches. Data cliches are recognized
by recognizing operations on them. These operations form groups, called “suites,” each of
which represents a globally consistent set of operations with respect to some data structure.
For example, Figure 7-1 shows four different consistent pairs of operations for inserting and
extracting elements from an indexed sequence. Each of these pairs represents valid operations to
be used together in implementing a stack, since they maintain stack discipline. Each pair
is a suite.

When  GRASPR  recognizes  an  individual  cliched  data  structure  operation,  it  reports  the 
recognition  of  the  operation  and  the  data  cliche.  Some  of  these  may  be  locally  ambiguous. 
For  example,  zerop  and  null  can  be  empty  tests  for  a  variety  of  cliched  data  structures.  Also, 
some  recognitions  might  not  be  globally  consistent  with  the  recognition  of  other  operations 
on  the  same  data  elsewhere  in  the  program.  For  example,  recognizing  one  operation  from  a 
suite  in  Figure  7-1  does  not  necessarily  mean  a  Stack  is  being  used  in  the  program.  Another 
access  or  update  to  this  same  aggregate  data  structure  elsewhere  in  the  program  might  use 
an  operation  from  another  suite. 

GRASPR  does  not  attempt  to  disambiguate  recognitions  of  data  structure  operations.  Nor 
does  it  globally  check  that  the  data  that  has  been  recognized  as  the  data  cliche  is  always 
operated upon by operations in the same suite. The main reason is that GRASPR is not the
component best suited for this task.
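
To make the missing global check concrete, the following sketch shows what a higher-level mechanism might compute once operation instances have already been grouped by the data they act on. That grouping is assumed as given here, although it is exactly the hard part in practice. The suite names, operation names, and the `consistent_suites` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical suite table: each suite is a mutually consistent push/pop pair.
SUITES = {
    "suite-A": {"push-A", "pop-A"},
    "suite-B": {"push-B", "pop-B"},
}


def suites_containing(op):
    """Names of all suites that include the given operation."""
    return {name for name, ops in SUITES.items() if op in ops}


def consistent_suites(recognitions):
    """Map each data object to the suites containing every operation recognized on it."""
    ops_by_data = defaultdict(set)
    for data, op in recognitions:
        ops_by_data[data].add(op)
    result = {}
    for data, ops in ops_by_data.items():
        candidates = set(SUITES)
        for op in ops:
            candidates &= suites_containing(op)
        result[data] = candidates
    return result


recognitions = [
    ("stack1", "push-A"), ("stack1", "pop-A"),  # both from the same suite
    ("seq2", "push-A"), ("seq2", "pop-B"),      # operations from different suites
]
report = consistent_suites(recognitions)
```

An empty candidate set for a data object signals that its operations come from different suites, so a stack interpretation of that object would not be globally consistent.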

It  is  difficult  to  do  these  things  in  the  flow  graph  parsing  framework,  based  only  on  the 
data  and  control  flow  of  the  program.  This  is  because  instances  of  operations  that  act  on  the 
same  aggregations  of  data  are  often  difficult  to  group  together,  in  order  to  apply  consistency 
constraints  (i.e.,  check  that  they  are  all  in  the  same  suite).  As  we  discussed  earlier,  data  and 
control  flow  cannot  always  be  completely  determined  or  made  explicit.  So,  the  operations 
are  not  always  connected  directly  by  dataflow.  It  may  be  possible  to  uncover  direct  dataflow 
in  some  cases  (e.g.,  implicit  aggregation  might  be  made  explicit).  However,  often  aggregate 
data  structures  are  collected  in  primitive  data  structures  (e.g.,  lists  or  arrays)  which  do  not 
represent implicit aggregations. (For example, PiSim’s *Event-Queue* is a homogeneous list
of  Events.)  For  these,  the  connections  between  operations  on  the  aggregate  structures  must 
be  derived. 

In addition, negative constraints, such as that no other operations besides those in some
suite act on certain pieces of data, are difficult to check in our recognition framework. This
is particularly true when parts of the program are not available for analysis. For example,
in PiSim, the function Next-Instruction takes a user-defined data structure Task (which
corresponds to the EXECUTION-CONTEXT data cliche) and fetches an INSTRUCTION from an
array of INSTRUCTIONs nested within the Task data structure. The function uses the current
integer value of the Task’s “IP” part (which stands for “Instruction-Pointer”) to index into
the array. It then increments the “IP” part. GRASPR recognizes this function as a “Stack-Pop.”
However, in the machine operation simulation functions, which are given as input to
PiSim, the “IP” part of a Task is sometimes updated to an arbitrary value (in the code for
simulating branching operations), rather than being incremented or decremented.

Disambiguation and preferring recognitions may be done more easily by a higher-level
control mechanism which has access to other information about the program. For example,
user-defined part names provide a powerful clue to which structures an operation is acting
upon. It is often the case that the operations acting on data that was selected using the
same set of part names, or generating data that’s always stored in the same set of part names,
are the only ones used to access or change those parts. Mnemonic variable names (including
synonyms) and stylistic conventions (e.g., module decomposition) can also be a good source
of expectations about how operations should be grouped. This information must be used
heuristically and non-monotonically. (Section 4.2.3 discusses an initial attempt to map
user-defined data structure and part names to cliched structure names. However, these
mappings are not always complete or unambiguous.)

Figure 7-1: Four ways of implementing Stack-Push and Stack-Pop with the Stack implemented
as an Indexed-Sequence.

When portions of a program are not available for analysis, there may be other information
available about the interface between the unavailable code and the rest of the program,
such as which functions of the program are called and which new data structures are created.
This information can be used, for example, to determine that the “IP” part of a Task
is  not  always  updated  using  increment  or  decrement,  but  can  be  given  an  arbitrary  integer 
value.  The  recognition  process  can  be  seen  as  giving  as  output  the  cliches  recognized  and  a 
set  of  assumptions  or  invariants  on  which  the  recognition  of  those  cliches  is  dependent. 

7.2.4  Side  Effects  to  Mutable  Data  Structures 

We studied the recognition of aggregate data structures, independent of issues concerning
side effects to mutable data structures. In order to do this, we manually translated
our  example  programs  to  pure  (functional)  versions  and  recognized  pure  cliches  in  them. 
Fortunately,  the  translation  was  straightforward  and  much  of  it  may  be  automatable. 

An  open  problem  for  the  future  is  dealing  with  programs  that  contain  mutable  data 
structures  and  destructive  operations  on  them.  The  problem  is  modeling  the  dataflow 
correctly  in  representing  our  programs  as  dataflow  graphs.  This  is  complicated,  of  course, 
by  aliasing.  While  we  will  not  be  able  to  automatically  resolve  all  aliasing,  it  seems  possible 
to  use  recognition  to  uncover  common,  stereotypical  aliasing  patterns.  Complex  aliasing 
patterns  are  not  the  norm  [126,  127]. 

If recognition is interleaved with dataflow analysis, aliasing patterns might be recognized
and  used  to  help  correctly  translate  a  destructive  operation  into  its  non-destructive  version. 

There  are  two  main  classes  of  mutations  to  mutable  data  structures: 

1. mutations to fixed, named parts (e.g., (setf (queue-head queue) new-head)).

2.  mutations  to  a  “derived”  part  (e.g.,  searching  through  a  list  for  an  element  with  some 
property  or  satisfying  some  predicate  and  then  deleting  that  element). 

When  a  change  is  made  to  a  fixed,  named  part  of  a  data  structure,  this  destructive 
assignment  should  be  replaced  with  non-destructive  code  which  creates  a  new  data  structure 
containing  the  new  value  for  the  part  and  the  old  values  for  the  rest  of  the  parts.  It  must 
also  recursively  create  new  versions  of  the  data  structures  within  which  this  data  structure 
is nested. For example, consider the following destructive operation which updates the Time
part of a Node data structure, which is the value of the Node part of a given Task.

    (defun Set-Time-Of (Task New-Time)
      (setf (Node-Time (Task-Node Task))
            New-Time))

The following non-destructive translation of this operation creates a copy of the Task’s
Node, but giving the Time part the New-Time. It also creates a copy of the Task, with the
new Node as its Node part. It also returns the new, updated structures so that the callers of
Set-Time-Of can use them.

    (defun Set-Time-Of (Task New-Time)
      (let ((Task-Node (Task-Node Task)))
        (setq Task-Node (Make-Node :Time New-Time
                                   :ID (Node-ID Task-Node)
                                   :Segments (Node-Segments Task-Node)
                                   :Nodals (Node-Nodals Task-Node)))
        (values New-Time
                Task-Node
                (Make-Task :Handler (Task-Handler Task)
                           :Node Task-Node
                           :Segment (Task-Segment Task)
                           :IP (Task-IP Task)
                           :Status (Task-Status Task)))))

For  nesting  of  fixed,  named  parts,  it  may  be  possible  for  the  symbolic  evaluator  to  keep 
track  of  how  the  structures  are  nested.  The  symbolic  evaluator  can  treat  the  variables  bound 
to  data  structures  as  bound  to  sets  of  “part  variables,”  which  are  bound  either  to  regular 
values  or  to  other  data  structures  (i.e.,  sets  of  part  variables).  When  a  part  is  modified,  the 
part  variables  are  traced  backward  to  see  what  other  objects  are  modified. 
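
The idea of rebuilding every enclosing structure when a nested, named part changes can be sketched generically. The following is only an illustration under simplifying assumptions: the `Node` and `Task` classes are cut-down stand-ins with hypothetical field names, and `update_path` is a helper introduced here, not part of GRASPR's symbolic evaluator.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    time: int
    node_id: int


@dataclass(frozen=True)
class Task:
    node: Node
    ip: int


def update_path(obj, path, value):
    """Return a copy of obj with the part at `path` set to `value`,
    rebuilding each enclosing structure along the way."""
    if not path:
        return value
    head, rest = path[0], path[1:]
    return replace(obj, **{head: update_path(getattr(obj, head), rest, value)})


old = Task(node=Node(time=0, node_id=7), ip=3)
new = update_path(old, ("node", "time"), 42)
```

The path `("node", "time")` plays the role of the nesting information the symbolic evaluator would have to track: it names which enclosing objects must be copied when the innermost part is replaced. The original `Task` is left untouched.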

Aliasing  is  harder  to  uncover  when  mutations  are  made  to  derived  parts  because  it’s 
harder  to  prove  that  the  part  changed  is  the  same  as  the  part  pointed  to  by  something 
else.  (In  other  words,  the  “nesting”  relationships  are  derived.)  However,  these  types  of  side 
effects  usually  occur  in  cliched  operations,  such  as  searching  through  a  list  and  modifying 
the element found or changing all elements of an array. If we heuristically (and nonmonotonically)
assume that the aliasing pattern is localized and standard, we can transform the
cliched side-effecting operation to the functional version.

For  example,  a  common  aliasing  pattern  occurs  in  splicing  an  element  into  a  recursive 
data  structure,  such  as  a  list.  An  example  is  in  the  following  function  which  is  used  in 
PiSim  to  enqueue  events  on  an  event  queue  (which  is  a  priority-queue). 

    (defun Insert-Event (New-Event Event-Queue)
      (if (or (null (cdr Event-Queue))
              (< (Event-Time New-Event)
                 (Event-Time (second Event-Queue))))
          ;; push New-Event on (cdr Event-Queue)
          (rplacd Event-Queue
                  (cons New-Event (cdr Event-Queue)))
          (Insert-Event New-Event (cdr Event-Queue))))

In  this  splice-in  operation,  the  program  “cdrs-down”  the  list  Event-Queue  until  it  finds  a 
spot to insert the element New-Event. Then the new element is spliced in by destructively
modifying  the  cdr  of  the  current  list.  However,  the  current  list  is  not  only  pointed  to  by  the 
variable  holding  the  current  list,  but  also  by  the  cons  cell  at  the  end  of  the  sub-list  already 
passed.  This  aliasing  pattern  is  simple  and  localized  within  the  recursive  data  structure  and 
the  variables  used  in  the  splice-in  program.  It  is  very  common  in  our  example  programs. 

Suppose  GRASPR  recognized  the  pattern  of  cdr-ing  down  a  list  and  replacing  the  cdr 
(using  rplacd)  of  the  current  list  with  a  new  list  consisting  of  the  new  element  followed  by 
the  old  cdr  of  the  current  list.  Then  it  may  be  possible  to  replace  this  pattern  with  the 
following  non-destructive  version  in  which  the  side  effect  is  propagated  up  to  the  top  of  the 
data  structure. 

    (defun Insert-Event (New-Event Event-Queue)
      (if (or (null (cdr Event-Queue))
              (< (Event-Time New-Event)
                 (Event-Time (second Event-Queue))))
          (cons (car Event-Queue)
                (cons New-Event (cdr Event-Queue)))
          (cons (car Event-Queue)
                (Insert-Event New-Event (cdr Event-Queue)))))

In particular, the tail-recursive destructive program is replaced with a recursive non-destructive
program and the list is cdr’d down as usual, but the elements passed on the way are
remembered  in  the  stack  of  recursive  calls  and  are  used  to  create  a  copy  of  the  front  of  the 
list  on  the  way  back  out  of  the  recursion. 
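
The copying behavior of the non-destructive version can be checked with a small model. This sketch makes several simplifying assumptions: cons cells are modeled as Python pairs with `None` as the empty list, events are reduced to their times, and the first cell is treated as a header that is never displaced (mirroring the way the Lisp code only inspects the cdr). The helper names are introduced here for illustration.

```python
def cons(car, cdr):
    """Model a cons cell as an immutable pair."""
    return (car, cdr)


def insert_event(new_time, queue):
    """Non-destructive ordered insert; the car of `queue` acts as a header."""
    head, tail = queue
    if tail is None or new_time < tail[0]:
        # Splice point found: build a new cell in front of the old tail.
        return cons(head, cons(new_time, tail))
    # Copy the cell just passed on the way back out of the recursion.
    return cons(head, insert_event(new_time, tail))


def to_list(cell):
    """Flatten a chain of cons cells into a Python list."""
    out = []
    while cell is not None:
        out.append(cell[0])
        cell = cell[1]
    return out


old = cons("header", cons(1, cons(5, None)))
new = insert_event(3, old)
```

Only the front of the list (the cells passed before the splice point) is copied. The tail after the inserted element is shared between the old and new lists, and the old list is unchanged.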

Another common type of aliasing involves pooling structures which contain all existing
instances of some type of data structure. For example, the array *Nodes* contains all NODE
structures. When a part “Time” of a NODE is modified, this mutation should be replaced with
non-destructive code that not only creates a new NODE, with the new value for the part
“Time,” but also creates a new *Nodes* array, with the new NODE in place of the old.

This  update  of  the  pooling  structure  requires  knowing  the  inverse  translation  of  an 
object  to  its  pooling  structure.  This  can  be  difficult  to  compute.  However,  we  found  that  in 
our  example  programs,  all  of  the  objects  contained  in  pooling  structures  had  a  part,  such  as 
an  “ID”  number  or  a  “Tag”  symbol,  that  held  an  index  into  the  pooling  structure.  A  useful 
form  of  advice  is  an  identification  of  all  pooling  structures  in  the  program  (which  is  usually 
easy  for  a  person  to  provide,  based  on  mnemonic  variable  names  and  documentation)  and  an 
inverse  mapping  (if  any)  from  the  objects  pooled  to  the  pooling  structure.  As  was  suggested 
for  dealing  with  variation  due  to  handles,  GRASPR  can  elicit  advice  about  pooling  structures 
by  recognizing  question-triggering  patterns.  (See  Section  5.2.1.) 

7.2.5 Advising GRASPR

We  have  presented  a  recognition  architecture  that  has  a  flexible  control  structure  in  that  it 
can  accept  advice  to  help  control  its  complexity  and  to  guide  its  search  for  recognitions.  This 
advice  can  be  given  in  a  data-directed  way,  as  opposed  to  modifying  the  parsing  algorithm 
to  build  heuristics  into  the  system.  There  are  a  variety  of  “control  knobs”  and  parameters 
that  are  available  to  provide  GRASPR  with  guidance. 

•  Strict  versus  partial  node  orderings:  One  form  of  advice  that  can  be  given  to  control 
the  computational  complexity  of  the  recognition  system  is  a  specification  of  the  type  of 
node  ordering  that  should  be  imposed  on  the  right-hand  side  nodes  of  grammar  rules. 
Strict  node  orderings  are  cheaper,  since  they  generate  fewer  partial  and  duplicate 
items.  However,  partial  node  orderings  provide  more  near-miss  information,  which  is 
important  in  dealing  with  buggy  programs  and  in  eliciting  more  advice. 

•  Node  orderings:  Another  form  of  advice  is  the  choice  of  how  to  order  nodes  within 
a  strict  or  partial  node  ordering.  These  can  affect  the  order  in  which  constraints 
are  imposed,  so  that  stronger  constraints  are  imposed  early.  (For  example,  requiring 
salient  nodes  to  be  matched  first  imposes  strong  disambiguation  constraints  early.) 

•  Selection  of  items  from  agenda:  Procedures  can  be  provided  which  decide  which  items 
to  pull  from  the  current  agenda  and  process.  This  is  one  way  to  control  GRASPR’s  search 
strategy.  For  example,  certain  partial  items  might  be  pulled  from  the  agenda,  based 
on  which  part  of  the  input  program  they  have  started  to  match  or  based  on  how  much 
of  their  right-hand  sides  they  have  matched  already. 

•  Additional  monitors:  Special-purpose  monitors  can  be  defined  to  watch  the  chart  for 
particular  types  of  items  to  enter.  Additionally,  rules  for  question-triggering  patterns 
can  be  included  in  the  grammar  along  with  the  rules  for  cliches.  Monitors  can  watch 
for these patterns and then interact with outside agents. Monitors can also be defined
to watch for opportunities to “try-harder” by generating alternative views or by
weakening  some  constraints  that  make  an  analysis  fail.  The  recursion  folding  monitor 
described  in  Section  4.2.2  is  an  example  of  monitoring  for  items  that  are  failing  certain 
constraints,  but  which  might  be  made  to  complete  by  forcing  certain  constraints  to 
be  satisfied.  The  tasks  set  up  by  chart  monitors  can  be  prioritized  so  that  those  that 
are  expensive  or  less  likely  to  be  effective  can  be  postponed  while  quick,  promising 
tasks  are  accomplished  first. 

•  Indexing  partial  analyses:  In  addition  to  indexing  into  the  chart  to  retrieve  successful 
recognitions,  it  is  possible  to  index  into  the  chart  to  retrieve  partial  analyses  that 
fail  certain  types  of  constraints.  It  is  also  possible  to  find  out  approximately  how 
far the recognition of some cliche has gotten. GRASPR does this by taking the non-terminal
representing the cliche and enumerating, in breadth-first fashion, the non-terminals
that this non-terminal is built upon in the grammar. For each non-terminal,
it looks up all successful and failed recognitions of the non-terminal in the flow graph
representing the program. It cuts off the breadth-first traversal whenever a successful
or failed item is found for a non-terminal. These are collected and given as output.
In  other  words,  this  finds  the  highest  roots  of  the  possible  sub-derivation  trees  that 
can  build  up  to  the  recognition  of  the  cliche’s  non-terminal.  This  currently  does  not 
use  any  information  about  the  location  of  the  recognized  non-terminals.  It  is  best  for 
high-level  cliches  whose  parts  occur  infrequently  in  the  input  flow  graph.  Failed  items 
contain  information  about  which  constraints  they  failed  to  satisfy.  This  is  useful  in 
determining  what  can  be  done  to  push  the  recognition  through. 

•  Partitioning  constraints:  Section  6.4.1  described  various  heuristics  for  decomposing 
a  program  into  partitions  which  can  be  used  to  focus  the  parser.  This  information 
can be used by augmenting the extendibility criterion with a binary partitioning constraint.
This requires that a pair of complete and partial items that are candidates for
combination represent the recognition of sub-flow graphs within the same partition.
Combination attempts that fail this constraint can be postponed, rather than eliminated
altogether. This allows certain combinations to be preferred over others, while
allowing  less  favorable  combinations  to  be  available  in  a  later  try-harder  phase.  The 
advantage  is  that  completeness  will  not  be  lost  due  to  heuristic  partitioning.  Also, 
the  partitioning  constraint  can  be  selectively  applied  on  a  rule-by-rule  basis  and  to 
particular  pairs  of  nodes  in  a  rule’s  right-hand  side. 
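
The breadth-first lookup described under "Indexing partial analyses" can be sketched as follows. This is a simplified model, not GRASPR's chart: the grammar is reduced to a map from each non-terminal to the non-terminals it is built upon, the chart to a map from non-terminals to their successful or failed items, and the traversal is assumed to start at the cliche's own non-terminal. All cliche names and item contents are hypothetical.

```python
from collections import deque


def recognition_frontier(target, grammar, chart):
    """Find the highest non-terminals below `target` that have chart items."""
    frontier = {}
    seen = {target}
    queue = deque([target])
    while queue:
        nt = queue.popleft()
        items = chart.get(nt, [])
        if items:
            frontier[nt] = items
            continue  # cut off the traversal below this non-terminal
        for child in grammar.get(nt, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return frontier


grammar = {
    "Priority-Queue-Insert": ["Splice-In", "Ordered-Search"],
    "Splice-In": ["Cdr-Down", "Rplacd"],
    "Ordered-Search": ["Cdr-Down", "Compare"],
}
chart = {
    "Ordered-Search": [("failed", "constraint: same-list")],
    "Cdr-Down": [("ok", "nodes 4-9")],
    "Rplacd": [("ok", "node 12")],
    "Compare": [("ok", "node 7")],
}
frontier = recognition_frontier("Priority-Queue-Insert", grammar, chart)
```

In this example the failed "Ordered-Search" item stops the traversal on that branch, so "Compare" is never reported, while "Splice-In" has no items and is expanded into its parts. As the text notes, no location information is used, so the result is most informative when the parts occur rarely in the input graph.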

While  GRASPR  has  flexible  control  capabilities,  the  control  knobs  and  parameters  listed 
above  form  its  current  interface  for  accepting  advice.  More  work  is  needed  to  develop  a 
higher-level  interface  between  GRASPR  and  the  other  agents  it  will  interact  with  in  the  future 
hybrid  system. 
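
As one illustration of how such a knob might be exposed, the sketch below models the partitioning constraint from the list above as a filter that postpones, rather than discards, cross-partition combinations. The function names, the `try_harder` flag, and the item identifiers are hypothetical, and the real extendibility criterion involves more than partition membership.

```python
def split_by_partition(candidates, partition_of):
    """Separate (complete, partial) pairs into preferred and postponed lists."""
    preferred, postponed = [], []
    for complete, partial in candidates:
        if partition_of[complete] == partition_of[partial]:
            preferred.append((complete, partial))
        else:
            postponed.append((complete, partial))
    return preferred, postponed


def processing_order(candidates, partition_of, try_harder=False):
    """Same-partition combinations first; cross-partition ones only when trying harder."""
    preferred, postponed = split_by_partition(candidates, partition_of)
    order = list(preferred)
    if try_harder:
        order.extend(postponed)
    return order


partition_of = {"c1": 0, "p1": 0, "c2": 1, "p2": 0}
candidates = [("c2", "p2"), ("c1", "p1")]

first_pass = processing_order(candidates, partition_of)
second_pass = processing_order(candidates, partition_of, try_harder=True)
```

Since the postponed pairs are retained, a later try-harder phase can still attempt them, which is why completeness is not lost to the heuristic partitioning.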

Other forms of advice that are useful to GRASPR include indications of which structures
in  the  program  are  pooling  structures  (for  side  effect  analysis,  and  uncovering  the  use  of 
handles),  and  pointing  out  when  implicit  aggregation  and  manual  abstraction  are  being 
used.  These  might  be  elicited  during  recognition  (based  on  question-triggering  patterns)  or 
they  might  be  given  as  machine-readable  comments. 

For GRASPR to intelligently ask questions of a user (e.g., based on recognizing question-triggering
patterns), it must be able to refer to parts of the source text. When GRASPR
represents  programs  as  attributed  flow  graphs,  it  suppresses  a  great  deal  of  detail.  Although 
the  information  is  still  around  in  annotations,  GRASPR  currently  has  only  limited  facilities 
for  efficiently  mapping  from  one  representation  to  another.  (For  example,  it  associates  sets 
of  variables  to  dataflow  edges.  It  can  also  recreate  small  expressions  in  the  program.) 

Additionally, GRASPR is expected to interact with other reasoning components in the future,
which will perform such things as conditional simplifications, reasoning about dataflow
equalities, and data structure operation disambiguation and consistency checking. Multiple
representations of the program (including source text) will need to be maintained for GRASPR
to interface with these other components.

Additional  Code-Based  Information  Sources 

Aside  from  eliciting  advice  from  an  external  agent,  some  additional  information  can  be 
gleaned  from  the  leftover  non-cliched  parts  of  the  program,  particularly  in  the  program’s 
error  checking  and  its  initialization  procedures. 

Error Conditions. Non-local exits are currently ignored. (The non-local control flow
they represent is not modeled.) However, error conditions could be a useful form of machine-readable
comment. They often give part of the specification for the program. For example,
when  a  Handler  is  invoked  for  a  message  and  a  list  of  arguments,  PiSim  checks  whether 
exactly  the  right  number  of  arguments  were  given  to  the  handler: 

    (when (not (= (Handler-Arity Handler) (length Arguments)))
      (error "PiSim error: arity mismatch"))

If  a  cliche  is  being  looked  for  that  has  (length  Arguments)  as  a  subcomputation,  but 
the  program  uses  (Handler-Arity  Handler)  instead,  then  we  can  use  the  assertion  from  the 
error  condition  to  push  the  recognition  through. 
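
One way to picture this use of an error check is as an equality assertion that the matcher consults when two subcomputations differ textually. The sketch below is a minimal illustration under strong assumptions: expressions are compared as plain strings, the equality store is a simple union-find, and the class and function names are hypothetical.

```python
class Equalities:
    """Union-find over expression strings known to be equal."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def assert_equal(self, a, b):
        self.parent[self.find(a)] = self.find(b)

    def same(self, a, b):
        return self.find(a) == self.find(b)


def matches(expected, actual, eq):
    """True if the expressions are identical or known to be equal."""
    return expected == actual or eq.same(expected, actual)


eq = Equalities()
# Past the arity check, execution continues only if these two values are equal.
eq.assert_equal("(Handler-Arity Handler)", "(length Arguments)")

ok = matches("(length Arguments)", "(Handler-Arity Handler)", eq)
not_ok = matches("(length Arguments)", "(length Other-Arguments)", eq)
```

The assertion holds only for code reached after the check succeeds, so a real implementation would also have to scope the equality to the appropriate control environment.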

A  key  advantage  of  error  conditions  is  that  they  are  easier  to  process  and  more  up-to-date 
than  textual  comments. 

Initialization. GRASPR normally does not recognize computations for program initialization
or reading in input, since these are usually non-standard. They vary with the way
the data is organized. However, we can extract information from this non-standard code
about how data structures are organized. For example, the following code for Clear-Nodes
tells  how  the  parts  of  a  Node  interact.  The  part  Nodals  of  a  node  is  a  key  into  the  node’s 
Segments  part,  which  is  a  hash  table.  The  elements  of  this  hash  table  are  Segment  data 
structures,  whose  Data  parts  are  arrays. 

    (defun Clear-Nodes ()
      (loop for Node being the array-elements of *Nodes*
            for Nodals-ID = (Node-Nodals Node)
            for Nodals = (Hash-Lookup (Node-Segments Node) Nodals-ID)
            doing (setf (Node-Time Node) 0)
            doing (Clear-Hash-Table (Node-Segments Node))
            doing (Hash-Insert (Node-Segments Node) Nodals-ID Nodals)
            doing (loop with Data = (Segment-Data Nodals)
                        for Index from 0 below (array-total-size Data)
                        doing (setf (aref Data Index) 'Unbound))))

7.3 Related Work


We can contrast our work on program recognition with that of other researchers along
several lines. This section focuses mainly on the distinctions between the program and cliche
representations and the recognition techniques used. Both affect how well the recognition
systems can deal with variation, allow partial recognition, and fit into a hybrid system.

Our work is also distinguished from other program recognition research in that we analyze
our approach, both empirically and analytically. Much of the early work in program
recognition provides no analysis of the representations or techniques used. Some of the
more recent research includes some empirical analysis of techniques. These studies typically examine
the accuracy of recognition and the recognition rates over sets of programs (usually student
programs in program tutoring applications) [65, 95]. However, with the exception of
Hartman’s work [55], discussions of limitations have focused mainly on practical implementational
limitations, rather than on general limitations of the approach. They also do not
describe how additional information or guidance can help.

Our  recognition  work  can  also  be  compared  to  other  work  along  the  lines  of  the  types 
of  programs  and  cliches  recognized.  Our  recognition  system  is  able  to  recognize  structured 
programs  and  cliches  containing  conditionals,  loops  with  any  number  of  exits,  recursion, 
aggregate  data  structures,  and  simple  side  effects  due  to  assignments.  This  allows  GRASPR  to 
recognize larger programs than existing recognition systems. It also enables encoding and
recognition of domain-specific cliches as well as general-purpose ones, since many domain-specific
cliches are aggregate data structure cliches. With the exception of CPU [84], existing
recognition systems cannot handle aggregate data structure cliches and a majority do not
handle recursion. Talus [95] heuristically handles some side effects to lists and arrays.
The largest program recognized by any existing recognition system is a 300-line database
program recognized by CPU. All other systems work with programs on the order of tens
of lines. None deal with domain-specific cliches, except Laubsch’s system [81, 82]. Hartman’s
UNPROG [55] is the only system that has demonstrated recognition of unstructured
programs.

Our  earlier  work  on  the  “Recognizer”  [118,  144,  145]  is  typical  of  previous  approaches 
to  automating  program  recognition.  It  recognized  small,  contrived  example  programs,  on 
the  order  of  tens  of  lines.  Its  cliche  library  consisted  exclusively  of  general-purpose,  utility 
cliches. The Recognizer could deal with programs containing conditionals and loops, but not
regular  (non-tail)  recursion  or  data  aggregation.  Like  GRASPR,  it  used  a  dataflow  graph 
representation  for  programs  and  cliches,  but  it  employed  a  rigid  control  strategy.  (It  was 
based  on  a  subgraph  parsing  algorithm  that  evolved  from  Brotsky’s  algorithm.  See  Section 
3.5.)  The  development  of  the  Recognizer  was  a  feasibility  study  to  demonstrate  that  graph 
parsing  can  be  used  to  automate  recognition,  remove  many  types  of  variation,  and  create 
a  useful  description  of  a  program.  Our  current  work  moves  beyond  studying  feasibility 
by  analyzing  computational  costs,  studying  GRASPR’s  tolerance  (or  vulnerability)  to  various 
types of variation, identifying limits in graph grammar expressiveness for programming
cliches,  and  studying  how  GRASPR  can  fit  into  a  hybrid  understanding  system.  GRASPR  moves 
into  the  next  level  of  maturity  of  recognition  systems. 

7.3.1  Representation 

Johnson’s  PROUST  [65],  Ruth’s  system  [122],  Lukey’s  PUDSY  [87],  Looi’s  APROPOS2  [85] 
and Allemang’s DUDU [4, 5] operate directly on the program text. This limits the variability
and complexity of the structures that can be recognized, because these systems must
wrestle  directly  with  syntactic  variations,  performing  source-to-source  transformations  to 
twist  the  code  into  a  recognizable  form.  Most  of  these  systems’  effort  is  expended  trying  to 
canonicalize  the  syntax  of  the  program,  rather  than  concentrating  on  its  semantic  content. 
In  addition,  diffuse  cliches  pose  a  serious  problem. 

Because  the  types  of  patterns  searched  for  in  these  systems  are  sets  of  statements,  they 
limit  the  types  of  programs  in  which  they  can  be  found.  In  PUDSY,  the  group  of  statements 
matching  a  pattern  must  be  contiguous,  not  scattered  throughout  the  code.  Ruth’s  system 
translates  programs  into  a  Lisp-like  model  language  consisting  of  a  small  set  of  primitive 
operations. This representation abstracts away information about which particular binding
and control constructs were used. However, it assumes program statements are totally
ordered (by control flow as well as dataflow), rather than partially ordered (by data
dependencies only). This prevents the system from recognizing that two programs that differ
only  in  the  order  of  execution  of  two  independent  statements  are  the  same  modulo  this 
difference. 
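
The distinction between a total and a partial order can be illustrated with a small sketch. The Python below is not taken from Ruth’s system or from GRASPR; the statement encoding and the name `net_dataflow` are assumptions made for illustration. It substitutes each variable’s defining operation into its uses, so that only the data dependencies remain. Nested tuples stand in, very roughly, for a flow graph.

```python
from typing import Dict, List, Tuple, Union

# A statement is (target, operator, operands); operands name inputs or earlier targets.
Stmt = Tuple[str, str, Tuple[str, ...]]
Term = Union[str, tuple]


def net_dataflow(stmts: List[Stmt], outputs: List[str]) -> Tuple[Term, ...]:
    """Return, for each output, a term built only from operators and program inputs."""
    env: Dict[str, Term] = {}
    for target, op, operands in stmts:
        # Replace each operand by the term that currently defines it, if any.
        env[target] = (op,) + tuple(env.get(v, v) for v in operands)
    return tuple(env.get(v, v) for v in outputs)


# Two orderings of the same computation: t1 and t2 are independent of each other.
prog_a: List[Stmt] = [("t1", "+", ("x", "y")), ("t2", "*", ("x", "z")), ("r", "-", ("t1", "t2"))]
prog_b: List[Stmt] = [("t2", "*", ("x", "z")), ("t1", "+", ("x", "y")), ("r", "-", ("t1", "t2"))]

# Swapping statements that do depend on each other changes the dataflow.
prog_c: List[Stmt] = [("x", "+", ("x", "one")), ("y", "*", ("x", "two"))]
prog_d: List[Stmt] = [("y", "*", ("x", "two")), ("x", "+", ("x", "one"))]

same_text = prog_a == prog_b
same_flow = net_dataflow(prog_a, ["r"]) == net_dataflow(prog_b, ["r"])
dependent_same = net_dataflow(prog_c, ["y"]) == net_dataflow(prog_d, ["y"])
```

Here `prog_a` and `prog_b` differ as statement sequences but yield the same term for `r`, whereas `prog_c` and `prog_d` yield different terms for `y`. A representation that compares statement sequences sees two different programs in the first case as well.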

PROUST  uses  plan-difference  rules  to  account  for  mismatches  between  the  cliches  (which 
Johnson  calls  “plans”)  it  is  looking  for  and  the  actual  text  of  the  program.  These  may  allow 
the  code  to  be  transformed  into  an  equivalent  syntactic  variation  of  the  code  or  they  may 
trigger  the  identification  of  a  bug  as  being  one  listed  in  its  bug  catalog.  Thus,  allowable 
variations  in  code  are  limited  to  those  accounted  for  by  plan-difference  rules.  To  be  flexible 
and  powerful,  PROUST  must  have  a  large  knowledge  base  of  these  rules.  The  number  of 
rules  could  be  reduced,  however,  if  a  more  abstract  representation  for  programs  were  used, 
or  if  the  semantic  equivalence  of  the  mismatched  code  with  the  cliche  could  be  confirmed 
using  a  theorem  prover  [95]  or  symbolic  evaluation  [87]. 

Allemang’s  DUDU  (which  stands  for  Debugging  Using  Device  Understanding)  [4,  5] 
attaches  information  about  a  program’s  functional  semantics  to  its  representation.  DUDU’s 
representation  of  cliches  extends  Johnson’s  text-based  plan  representation  [65]  to  include 
not  only  goals  and  components  for  achieving  them,  but  also  causal  links  to  show  how 
the  components  achieve  the  goals.  For  example,  an  iterative  cliche  would  be  represented 
as  a  program  template  of  statements  with  assertions  that  the  loop  invariants  hold  after 
initialization, after each iteration, and when the loop terminates, as well as assertions that
the  terminating  conditions  hold  when  the  loop  terminates. 

The functional representation specifies which parts of a cliched program’s proof of correctness
are supported by which parts of its plan representation. (Allemang uses the functional
representation language of Sembugamoorthy and Chandrasekaran [125].) A key benefit
gained by this representation is that it provides useful information that can make it
easier  to  tolerate  variation  in  how  a  function  is  achieved.  Because  it  explicitly  describes  the 
purpose  or  function  of  each  part  of  a  cliche  in  the  context  of  a  larger  proof  of  correctness,  if 
some  part  of  the  cliche  does  not  match  the  program,  the  functional  representation  describes 
the  function  of  that  part.  It  may  then  be  possible  to  prove  that  the  mismatched  portion 
of the program still achieves this function. How much variation can be tolerated depends
on the generality of the associated proof (e.g., how generally are the loop invariants and
terminating  conditions  expressed). 

Reasoning  about  functional  semantics  in  this  way  requires  that  the  recognition  system 
know the intended function or purpose of a program. Like PROUST, DUDU was developed
in the context of debugging student programs, where this information is readily available.
However, for purely code-driven recognition (as is usually required in maintenance situations),
near-miss recognition of cliches must first be performed. This can be used to help
generate expectations about which subset of cliches to try harder to recognize by proving
that the functions of their unrecognized parts are still being achieved. However, this
requires overcoming the expense of near-miss recognition (see Section 6.2.7) and defining
preferences among near-misses.

One drawback of Allemang’s representation is that it is limited by its text-based representation
of cliches and programs. Since it directly extends PROUST’s text-based representation,
it inherits PROUST’s problems with syntactic variation. This can be avoided by
using  a  graph  representation,  such  as  ours,  as  the  base  upon  which  to  attach  the  functional 
information  (see  [4],  Section  7.4). 

Adam  and  Laurent’s  LAURA  [2]  represents  programs  as  graphs,  thereby  allowing  some 
syntactic  variability.  However,  the  graph  representation  differs  from  ours  in  that  dataflow 
is  represented  implicitly  in  the  graph  structure.  Nodes  represent  assignments,  tests,  and 
input/output  statements,  rather  than  simply  operations;  arcs  represent  only  control  flow. 
Because of this, LAURA must rely on the use of program transformations to “standardize”
the dataflow. (GRASPR need not perform these transformations, since the flow graph
representation shows net dataflow explicitly.) LAURA debugs a program by comparing it
to a given correct implementation, called the program model, of the algorithm which the
program  is  supposed  to  be  using.  Only  the  program  model’s  implementation  is  recognizable 
in  the  program;  no  implementational  variation  is  allowed. 

The  system  proposed  by  Fickas  and  Brooks  [43]  uses  a  Plan  Calculus-like  notation, 
called program building blocks (pbbs), for cliches. Each pbb specifies inputs, outputs, post-conditions,
and pre-conditions. (Pbbs are equivalent to Waters’ segments [137].) The
structure  of  the  library  is  provided  by  implementation  plans,  which  are  like  implementation 
overlays  in  the  Plan  Calculus.  They  decompose  non-primitive  pbbs  into  smaller  pbbs,  linked 
by dataflow and purpose descriptions. However, on the lowest level of their library (unlike
that used by GRASPR), the pbbs are mapped to language-specific code fragments which are
matched  directly  against  the  program  text.  Thus,  this  system  also  falls  prey  to  the  syntactic 
variation  problem. 

Murray’s Talus [95] uses an abstract frame representation (called an E-frame) for programs.
The slots of an E-frame contain information about the program, including the type
of  recursion  used,  the  termination  criteria,  and  the  data  types  of  the  inputs  and  outputs. 
This  representation  helps  abstract  away  from  the  syntactic  code  structure  by  extracting 
semantic  features  from  the  program,  allowing  greater  syntactic  variability.  However,  listing 
all  characteristics  of  the  code  in  E-frame  slots  fails  to  expose  constraints  (such  as  dataflow 
constraints)  in  a  way  that  facilitates  recognition. 

Bertels  [11]  defines  a  broad  hierarchy  of  programming  knowledge  with  programming 
primitives  on  the  bottom,  problem  solving  strategies  at  the  top  and  cliches  at  successively 
higher  levels  of  abstraction  in  between.  The  problem  solving  strategies  are  strategies  for 
debugging (e.g., slicing), program understanding (e.g., conjecturing), and program synthesis
(e.g., divide and conquer). Each level builds on the levels below it. Bertels’ model
of  programming  knowledge  also  includes  rules  of  programming  discourse  [128]  which  are 
applicable  at  all  levels  in  the  hierarchy. 

To  represent  cliches,  Bertels  uses  conceptual  schemes,  which  are  essentially  hierarchical 
semantic  networks.  Like  our  flow  graph  formalism,  these  schemes  focus  on  data  and  control 
flow  constraints.  Each  conceptual  scheme  hierarchically  represents  the  decomposition  of 
some  goal  into  subgoals  and  the  methods  for  achieving  them.  They  can  also  represent 
multiple  alternative  methods  for  achieving  some  goal.  Their  hierarchical  structure  resembles 
the  organization  of  cliches  in  our  library,  as  shown  in  Figures  2-1,  2-3,  and  2-4.  Additional 
information  included  in  the  conceptual  scheme  identifies  the  roles  and  various  characteristics 
of  the  pieces  of  data  used  by  the  methods  (e.g.,  that  some  piece  is  a  divisor  and  has  a 
minimum  value  of  0).  Dataflow  connections  are  not  explicitly  represented. 

At the lowest level, conceptual schemes are built out of “Semantically Augmented Programming
Primitives” (or SAPPs). These are programming primitives that have been classified
in terms of their role in the program on a slightly higher level of abstraction. For
example, an assignment might be viewed as an increment and a predicate can be seen as
a loop exit test or a filter. In general, it is difficult to unambiguously make this classification
of primitives, but Bertels uses a very restricted unambiguous set of SAPPs. These
correspond  to  our  lowest  level  cliches. 

Letovsky’s Cognitive Program Understander (CPU) [84] uses a lambda calculus representation
for programs. CPU uses transformations to standardize (i.e., make more canonical)
the program’s syntax and to simplify expressions. However, Letovsky generalizes canonicalization
to be the entire means of program recognition. Canonicalization involves not only
standardizing  the  syntax  of  the  program,  but  also  standardizing  the  expression  of  standard 
plans  (i.e.,  cliches)  in  the  program.  Recognizing  a  plan  that  achieves  a  particular  goal  is 
equivalent to canonicalizing the plan expression to the goal. So, CPU uses a single, general
transformation  mechanism  for  dealing  with  syntactic  variability  and  for  recognition.  In 
contrast,  GRASPR  uses  a  special-purpose  mechanism  (the  program-to-flow  graph  translator) 
to  factor  out  most  of  the  syntactic  variability  before  recognition  is  attempted. 

For  CPU  to  localize  cliches  in  a  lambda  expression  so  that  a  transformation  rule  can 
apply,  numerous  transformations  need  to  be  made  to  copy  subexpressions  and  move  them 
around the program. For example, function-inside-if ([84], p. 109) copies functional applications
to all branches of a conditional, and stored expressions are copied to replace each
corresponding variable reference. This is expensive both in the time it takes to apply
transformations and in the exponential space blow-up that occurs as a result. In our representation,
cliches are localized in the connectivity of the flow graphs. In addition, the ability
of  the  parser  to  generate  multiple  analyses  enables  GRASPR  to  recognize  two  cliches  whose 
implementations  overlap  without  first  copying  the  parts  that  are  shared,  as  CPU  must. 
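
The space blow-up caused by copying can be made concrete with a small sketch. The code below does not reproduce CPU’s transformations; the binding chain, the operator `f`, and all function names are hypothetical. It inlines every variable reference, as a copying strategy must, and compares the size of the resulting tree with the number of distinct subexpressions, which is what a graph with shared nodes would store.

```python
from typing import Dict, Set, Union

Expr = Union[str, tuple]  # a variable name, or (operator, arg, ...)


def inline_all(bindings: Dict[str, Expr], expr: Expr) -> Expr:
    """Replace every bound variable reference by a copy of its defining expression."""
    if isinstance(expr, str):
        return inline_all(bindings, bindings[expr]) if expr in bindings else expr
    op, *args = expr
    return (op, *[inline_all(bindings, a) for a in args])


def tree_size(expr: Expr) -> int:
    """Count nodes when every copy is stored separately."""
    if isinstance(expr, str):
        return 1
    return 1 + sum(tree_size(a) for a in expr[1:])


def shared_size(expr: Expr) -> int:
    """Count distinct subexpressions, i.e. nodes when shared parts are stored once."""
    seen: Set[Expr] = set()

    def visit(e: Expr) -> None:
        if e in seen:
            return
        seen.add(e)
        if not isinstance(e, str):
            for a in e[1:]:
                visit(a)

    visit(expr)
    return len(seen)


n = 10
# x1 = f(x0, x0), x2 = f(x1, x1), ...: each value is used twice by the next step.
bindings = {f"x{i}": ("f", f"x{i - 1}", f"x{i - 1}") for i in range(1, n + 1)}
expanded = inline_all(bindings, f"x{n}")
print(tree_size(expanded), shared_size(expanded))  # prints: 2047 11
```

For a chain of n such bindings the copied tree has 2^(n+1) - 1 nodes, while the shared form has n + 1. The chain is a deliberately bad case, chosen only to show why copying can be exponential.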

Another  difference  arising  from  the  use  of  the  lambda  calculus  formalism  is  in  the  types  of 
cliches  that  can  be  expressed.  The  components  of  a  cliche  expressed  in  the  lambda  calculus 
must be connected in terms of dataflow interaction. CPU’s assumption is that cliches are
tied  together  by  dataflow,  otherwise  there  is  nothing  bringing  the  results  together.  (One 
exception  to  this  is  a  data  abstraction  plan  in  which  a  non-lambda-calculus  tupling  operation 
is  used  to  bind  together  multiple  dataflows  into  a  single  value.)  In  flow  graph  grammar  rules, 
cliches  can  contain  components  that  are  disconnected  in  terms  of  dataflow,  but  which  are 
tied  together  by  other  constraints,  such  as  control  flow. 

There  is  also  a  difference  between  CPU’s  transformations  and  our  grammar  rules.  Simple 
transformations  are  similar  to  grammar  rules,  but  complex  transformations  often  specify 
procedurally  how  to  change  the  program.  For  example,  the  loop  analysis  transformation 
is procedural. Loop cliches, such as filtering out certain elements from a list that is being
enumerated,  are  transformed  using  a  recursion  elimination  technique  in  which  the  patterns 
of  dataflow  in  a  loop  are  analyzed  and  classified  as  stream  expressions.  Then,  based  on 
dataflow  dependencies,  occurrences  of  primitive  loop  plans  are  identified  and  composed  to 
represent  the  loop.  (This  is  Waters’  temporal  abstraction  technique  [137,  138].)  Our  rules, 
on  the  other  hand,  are  declarative.  They  can  be  used  in  both  synthesis  (generation)  and 
analysis  (parsing). 

Laubsch  and  Eisenstadt  [81,  82]  and  Lutz  [88]  use  variations  of  the  Plan  Calculus. 
Laubsch and Eisenstadt’s system differs from GRASPR in the recognition technique it employs.
Lutz  proposes  using  a  program  recognition  approach  similar  to  ours.  See  Section  3.6  for 
the  relationship  of  Lutz’s  “flowgraphs”  to  our  flow  graphs.  (Both  of  these  approaches  will 
be  described  further  in  the  next  section.) 

Ning’s  PAT  [100,  54]  organizes  its  cliche  library  as  a  hierarchy  of  event  classes.  Each 
instance of a cliche is an object, which is an instance of an event class. Each object is a
set  of  attribute-value  pairs,  representing  information  about  an  abstract  cliched  operation. 
They  specify  the  variables  involved  and  lexical  information  (given  in  terms  of  statement  line 
numbers and block numbers) describing the control path leading to the event. Relationships
between  program  components,  such  as  calling,  declaration,  and  data  dependencies,  are 
all  encoded  implicitly  in  the  event  object  attributes.  Interval  logic  (which  is  similar  to 
Allen’s  temporal  logic)  is  used  to  derive  these  relationships  during  recognition.  Because 
these  relationships  are  not  made  explicit  in  the  representation,  their  derivation  places  a 
computational  burden  on  the  recognition  process. 

Hartman’s  UNPROG  [55]  uses  a  graphical  representation,  called  a  hierarchical  program 
model,  or  HMODEL,  that  is  roughly  the  dual  of  our  dataflow  graph  representation.  UNPROG 
recognizes  cliched  patterns  of  control  flow,  called  control  concepts,  such  as  “read-process 
loop”, and “bounded linear search”. The HMODEL representation consists of a hierarchically
decomposed control flow graph and a type of dataflow graph. The nodes of the control
flow graph are primitive actions, tests, joins, or other sub-HMODELS and its edges represent
the control flow between them. The control flow graph is hierarchically partitioned
by proper decomposition, which bundles up sub-graphs that are single-entry, single-exit.
This  static  partitioning  is  performed  before  recognition  is  attempted.  The  dataflow  graph 
represents  definition-use  relations  between  the  variable  names  referred  to  by  the  control 
flow  graph  nodes. 

The  HMODEL  representation  can  be  seen  as  an  encoding  of  plan  diagrams  (see  Section 
4.1.2)  in  a  graph  representation  which  retains  the  control  flow  information  in  the  graph 
structure, but which relegates the dataflow information to attributes (definition-use relations).
However, unlike plan diagrams, HMODEL does not represent net dataflow: the
definition  and  use  of  variable  names  is  explicitly  captured  and  assignment  is  considered  a 
primitive  action. 
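
A rough sketch of definition-use relations for straight-line code may make the contrast with net dataflow clearer. It is not HMODEL’s actual encoding; the statement format and the name `def_use_pairs` are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

# A statement is (defined variable, variables used), listed in control flow order.
Stmt = Tuple[str, Tuple[str, ...]]


def def_use_pairs(stmts: List[Stmt]) -> List[Tuple[int, int, str]]:
    """Return (defining statement, using statement, variable) for straight-line code."""
    last_def: Dict[str, int] = {}
    pairs: List[Tuple[int, int, str]] = []
    for index, (target, uses) in enumerate(stmts):
        for var in uses:
            if var in last_def:
                pairs.append((last_def[var], index, var))
        last_def[target] = index
    return pairs


# 0: t = a + b    1: u = t    2: r = u * c
program: List[Stmt] = [("t", ("a", "b")), ("u", ("t",)), ("r", ("u", "c"))]
pairs = def_use_pairs(program)
```

For this program the pairs are `(0, 1, "t")` and `(1, 2, "u")`: the copy through `u` remains visible as a primitive action with its own definition and use. A net dataflow view would instead connect the addition directly to the multiplication.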

Due  to  its  emphasis  on  control  flow,  the  HMODEL  representation  is  able  to  concisely 
represent  general  control  flow  patterns,  which  are  more  difficult  to  capture  in  our  dataflow 
graphs.  (See  Section  5.2.3.)  On  the  other  hand,  our  dataflow  graphs  concisely  capture 
constraints  on  patterns  of  dataflow  that  must  exist  for  instances  of  algorithmic  and  data 
structure  cliches  to  occur.  The  two  representations  are  complementary.  UNPROG  and 
GRASPR could profitably co-operate as co-routines: UNPROG could quickly provide coarse-grain
analysis of control patterns, which suggest the existence of certain algorithmic cliches,
while  GRASPR  could  focus  on  a  more  detailed  recognition  of  these  cliches  in  the  parts  of  the 
program  narrowed  down  by  UNPROG. 

7.3.2  Other  Recognition  Techniques 

Besides  representational  differences,  GRASPR  differs  from  other  current  recognition  systems 
in  its  technique  for  performing  recognition.  Existing  recognition  techniques  differ  from  ours 
mainly  in  the  flexibility  of  their  control  strategy,  how  they  use  heuristics,  and  how  much 
knowledge  about  the  purpose  or  goals  of  the  program  they  require  as  input  to  help  guide 
their  search. 

Our recognition architecture has a general, flexible control structure which can accept
advice and guidance from external agents. Other existing recognition systems are committed
to a rigid (often ad hoc) control strategy. Most search for a single best interpretation of the
program, while permanently cutting off alternatives. This can cause cliches to be missed.
They cannot try harder later to incrementally increase their power and find cliches that
the heuristic recognition missed. They also cannot generate multiple views of the program
when desired, nor provide partial information when only near-misses of cliches are present.

In  addition,  many  of  these  systems  have  heuristics  for  controlling  cost  built  in  directly. 
These are chosen on a trial-and-error basis. For example, they often evolve through
experimentation  with  sets  of  student  programs  until  a  good  level  of  performance  is  reached. 
Interesting  future  work  with  GRASPR  will  try  to  formulate  probabilities  of  consistency  for 
constraints  (see  Section  6.2.5),  which  can  be  computed  and  used  to  automatically  tailor 
the  recognition  system  to  check  certain  constraints  before  others.  This  would  dynamically 
prioritize  constraints  based  on  a  given  program  and  library  of  cliches,  rather  than  statically 
prioritizing them for good performance over “typical” programs and cliches.
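
One way such a prioritization might work is sketched below. This is not a design taken from GRASPR. It assumes that constraint checks are independent, that each has a known cost and probability of succeeding, and that checking stops at the first failure. The constraint names and figures are hypothetical. Under those assumptions, ordering by cost divided by failure probability minimizes the expected checking cost.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Sequence


@dataclass(frozen=True)
class Constraint:
    name: str
    cost: float          # assumed cost of checking the constraint once
    p_consistent: float  # assumed probability that the check succeeds


def expected_cost(order: Sequence[Constraint]) -> float:
    """Expected cost when checking stops at the first constraint that fails."""
    total, p_reached = 0.0, 1.0
    for c in order:
        total += p_reached * c.cost
        p_reached *= c.p_consistent
    return total


def prioritize(constraints: Sequence[Constraint]) -> List[Constraint]:
    """Put cheap, likely-to-fail checks first (smallest cost per unit failure probability)."""
    def key(c: Constraint) -> float:
        p_fail = 1.0 - c.p_consistent
        return c.cost / p_fail if p_fail > 0.0 else float("inf")
    return sorted(constraints, key=key)


# Hypothetical figures, not measurements from GRASPR.
constraints = [
    Constraint("control-flow", cost=5.0, p_consistent=0.2),
    Constraint("node-type", cost=1.0, p_consistent=0.9),
    Constraint("dataflow", cost=2.0, p_consistent=0.5),
]
ordered = prioritize(constraints)
brute_force_best = min(expected_cost(p) for p in permutations(constraints))
```

With these figures the chosen order is dataflow, control-flow, node-type, and its expected cost matches the minimum found by trying every permutation. Recomputing the probabilities for each program and library would change the order without changing the code.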

Many  recognition  techniques  also  take  information  about  the  goals  and  purpose  of  the 
program  (in  the  form  of  a  specification  or  model  program).  Some  recognition  systems  can 
accept  and  respond  to  information  from  other  non-recognition  techniques  (e.g.,  a  theorem 
prover  [95]  or  dynamic  analysis  of  program  executions  [85])  with  which  they  are  integrated. 
While these techniques show the utility of these additional sources of information, they
rely on this information being given as input, rather than accepting it and responding to
it  if  it  becomes  available.  Most  of  these  systems  have  been  developed  in  the  context  of 
intelligent  tutoring  systems  for  teaching  programming  skills.  In  this  domain,  the  purpose  of 
the  program  being  analyzed  is  very  well-defined.  It  can  be  used  to  provide  reliable  guidance 
to  the  program  recognition  process.  However,  in  many  other  task  applications,  especially 
software  maintenance,  information  about  the  purpose  of  the  program  and  its  design  is  rarely 
complete,  accurate,  or  detailed  enough  to  rely  on  as  required  input. 

Johnson’s  PROUST  [65]  is  a  system  that  analyzes  and  debugs  PASCAL  programs  written 
by  novice  programmers.  It  takes  as  input  a  description  of  the  goals  of  the  program  and 
knowledge  about  how  goals  can  be  decomposed  into  subgoals,  as  well  as  the  relationships 
between  goals  and  the  computational  patterns  (cliches)  that  achieve  them.  Based  on  this 
information, PROUST searches the space of goal decompositions, using heuristics to permanently
prune the search. (For example, it uses heuristics about which goals and patterns
are  likely  to  occur  together.)  PROUST  looks  up  the  typical  patterns  that  implement  the 
goals  and  tries  to  recognize  at  least  one  in  the  code.  The  low  level  patterns  that  actually 
implement  the  goals  are  then  found  by  simple  pattern  matching. 

Ruth’s  system  [122],  like  PROUST,  is  given  a  program  to  analyze  and  a  description  of 
the  task  that  the  program  is  supposed  to  perform.  The  system  matches  the  code  against 
several implementation patterns (cliches) that the system knows about for performing the
task. Ruth’s approach is similar to GRASPR’s in that the system uses a grammar to describe a
class of programs and then tries to parse programs using that grammar. The differences are
that  Ruth’s  system  makes  use  of  knowledge  about  the  purpose  of  the  program  (in  the  form 
of  a  task  description)  to  narrow  down  its  search  and  the  program  is  analyzed  in  its  textual 
form  and  is  therefore  parsed  as  a  string.  Another  difference  is  that  Ruth’s  system  does  no 
partial  recognition.  The  entire  program  must  be  matched  to  an  algorithm  implementation 
pattern  for  the  analysis  to  work. 

Lukey’s  Program  Understanding  and  Debugging  System  (PUDSY)  [87]  also  takes  as 
input  information  about  the  purpose  of  the  program  it  is  analyzing,  in  the  form  of  a  program 
specification,  which  describes  the  effects  of  the  program.  This  description  is  not  used, 
however, in guiding the search for cliches. Rather, PUDSY analyzes the program and then
compares the results of the analysis to the program specification. Any discrepancy is pointed
out as a bug. The analysis proceeds as follows. PUDSY first uses heuristics to segment the
program  into  chunks,  which  are  manageable  units  of  code  (e.g.,  a  loop  is  a  chunk).  It  then 
describes  the  flow  of  information  (or  interface)  between  the  chunks  by  generating  assertions 
about  the  values  of  the  output  variables  of  each  chunk.  These  assertions  are  generated  by 
recognizing familiar patterns of statements (called schema), similar to GRASPR’s cliches, in
the  chunks.  Associated  with  each  schema  are  assertions  describing  their  known  effects  on 
the  values  of  variables  involved.  For  chunks  that  have  not  been  recognized,  assertions  are 
generated  by  symbolic  evaluation. 

Adam  and  Laurent’s  LAURA  [2]  receives  information  about  the  program  to  be  analyzed 
and  debugged  in  the  form  of  a  model  program,  which  correctly  performs  the  task  that  the 
program  to  be  analyzed  is  supposed  to  accomplish.  LAURA  then  compares  the  graphs  of  the 
two  programs  and  treats  any  mismatches  as  bugs.  Since  nodes  are  really  statements  of  the 
program,  the  graph  matching  is  essentially  statement-to-statement  matching.  The  system 
works best for statements that are algebraic expressions because they can be normalized
by  unifying  variable  names,  reducing  sums  and  products,  and  canonicalizing  their  order. 
The  system  heuristically  applies  graph  canonicalizing  transformations  to  try  to  make  the 
program  graph  better  match  the  model  graph.  It  can  find  low-level  and  localized  bugs  by 
identifying  slight  deviations  of  the  program  graph  from  the  model  graph. 
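
The kind of normalization described for algebraic expressions can be sketched as follows. This is not LAURA’s transformation set. It illustrates only two of the steps mentioned, flattening nested sums and products and putting their operands into a canonical order. The tuple encoding and the name `canonical` are assumptions.

```python
from typing import Union

Expr = Union[str, tuple]  # a variable or constant name, or (operator, arg, ...)

COMMUTATIVE = {"+", "*"}  # operators whose operands may be reordered freely


def canonical(expr: Expr) -> Expr:
    """Flatten nested sums/products and sort the operands of commutative operators."""
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [canonical(a) for a in args]
    if op in COMMUTATIVE:
        flat = []
        for a in args:
            if not isinstance(a, str) and a[0] == op:
                flat.extend(a[1:])  # (a + b) + c becomes a + b + c
            else:
                flat.append(a)
        args = sorted(flat, key=repr)  # any fixed total order will do
    return (op, *args)


student = ("*", "y", ("+", "x", "1"))
model = ("*", ("+", "1", "x"), "y")
matches = canonical(student) == canonical(model)
```

After normalization the two expressions compare equal, so a statement-to-statement match succeeds even though the operands were written in a different order. Operands of non-commutative operators such as `-` are left in place.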

The system proposed by Fickas and Brooks [43] starts with a high-level cliche abstractly
describing the purpose of the program. From this, it hypothesizes refinements and decompositions
to subcliches, based on its implementation plans (analogous to overlays in the Plan
Calculus).  These  hypotheses  are  verified  by  matching  the  code  fragments  of  the  cliches  on 
the  lowest  level  of  the  library  with  the  code.  While  a  hypothesis  is  being  verified,  other 
outstanding  clues  (called  beacons)  may  be  found  that  suggest  the  existence  of  other  cliches. 
This  leads  to  the  creation,  modification,  and  refinement  of  other  hypotheses  about  the  code. 

Murray’s  Talus  system  [95]  is  given  a  student  program  to  be  analyzed  and  debugged,  as 
well  as  a  description  of  the  task  the  program  is  supposed  to  perform.  It  has  a  collection  of 
reference  programs  that  perform  various  tasks  that  may  be  assigned  to  the  student.  The 
task  description  is  used  to  narrow  down  the  reference  programs  that  need  to  be  searched 
to find one that best matches the student’s possibly buggy program. Heuristic and formal
methods  are  interleaved  in  Talus’s  control  structure.  Symbolic  evaluation  and  case  analysis 
methods  detect  bugs  by  pointing  out  mismatches  between  the  reference  program  and  the 
student’s  program.  Heuristics  are  then  used  to  form  conjectures  about  where  bugs  are 
located.  Theorem  proving  is  used  to  verify  or  reject  these  conjectures.  The  virtue  of  this 
approach  is  that  heuristics  are  used  to  pinpoint  relatively  small  parts  of  the  program  where 
some  (expensive)  formal  method  (such  as  theorem  proving)  may  be  applied  effectively. 
However,  the  success  of  the  system  depends  heavily  on  the  heuristics  that  identify  the 
algorithm,  find  localized  dissimilarities  between  the  reference  program  and  the  student’s 
program,  and  map  the  student’s  variables  to  reference  variables. 

Looi’s  APROPOS2  [85]  uses  a  technique  very  close  to  Talus’s.  It  matches  a  Prolog 
program  against  a  set  of  possible  algorithms  for  a  particular  task.  Like  Talus,  it  applies  a 
heuristic  best-first  search  of  the  algorithm  space  to  find  the  best  fit  to  the  code. 

Bertels’  [11]  Camus  performs  recognition  of  programs  for  the  purposes  of  debugging 
student  programs.  It  compares  student  programs  against  a  model  program  as  follows. 
Camus  uses  a  knowledge  base  containing  the  knowledge  necessary  to  analyze  a  program 
that  is  intended  to  solve  the  classic  Noah  Rainfall  Problem  [65].  The  model  and  student 
programs  are  each  analyzed  using  this  knowledge  base.  The  analysis  converts  each  program 
into a “High Level Description” (HLD), containing the conceptual schemes that are found in
the  program.  Camus  first  “augments”  the  programming  primitives  found  in  the  program  by 
classifying  them  in  terms  of  their  role  on  a  slightly  higher  level  of  abstraction  (i.e.,  it  creates 
SAPPs  -  see  Section  7.3.1).  Based  on  these  SAPPs,  conceptual  schemes  are  recognized  in 
a  bottom-up,  heuristic  fashion,  using  beacons  as  guides.  The  two  HLD’s  are  compared 
(currently  by  a  straightforward  manual  process)  and  any  inconsistency  or  incompleteness 
in  the  student  HLD  is  reported  as  a  bug. 

There  are  a  few  other  recognition  techniques  that,  like  GRASPR,  are  purely  code-driven. 
These  will  be  described  in  the  remainder  of  this  section. 

Letovsky’s  CPU  [84]  uses  a  technique  called  transformational  analysis.  It  takes  as  input  a 
lambda  calculus  representation  of  the  source  code  and  a  collection  of  correctness-preserving 
transformations  between  lambda  expressions.  Recognition  is  performed  by  opportunistically 
applying  the  transformations:  when  an  expression  matching  a  standard  plan  (cliche)  is 
recognized,  it  is  rewritten  to  an  expression  of  the  plan’s  goal.  This  is  similar  to  the  parsing 
performed  by  GRASPR,  except  that  CPU  does  not  find  all  possible  analyses.  Rather,  it 
uses  a  simple  recursive  control  structure  in  applying  transformations:  when  more  than  one 
standard  plan  matches  a  piece  of  code,  an  arbitrary  choice  is  made  between  them.  The 
program  is  destructively  reduced  and  the  alternative  is  never  explored  further.  Letovsky 
defines  a  well-formedness  criterion  for  the  library  of  cliched  plans  which  requires  that  no 
plan  be  a  generalization  of  any  other  plan.  If  the  library  is  well-formed,  then  this  arbitrary 
choice  will  not  matter,  since  recognizing  one  plan  will  not  prevent  the  recognition  of  another. 
However, this relies on the fact that CPU performs a great deal of copying: if two cliches
overlap  in  a  program  (e.g.,  as  a  result  of  merging  implementations  as  an  optimization),  their 
common  subparts  are  copied  so  that  each  cliche  can  be  recognized  individually  without 
interfering  with  the  recognition  of  the  other  cliche.  Unfortunately,  this  leads  to  the  problem 
of  severe  “expression  swell.” 

CPU is not able to generate multiple partial analyses of the program. There are situations
in which it is better (or necessary) to carry along multiple possible analyses, while
sometimes it is sufficient to generate just one analysis. For example, in verification applications,
any analysis is all that is required. However, multiple analyses are often helpful for
programs  in  which  there  are  unrecognizable  sections  which  lead  to  several  useful  ways  of 
partially  recognizing  the  program.  Being  able  to  generate  partial  (near-miss)  recognitions 
is  important  in  robustly  dealing  with  buggy  programs  as  well  as  in  eliciting  advice. 

The  value  of  our  flexible  control  strategy  is  that  we  can  tailor  it  to  a  particular  ap¬ 
plication  or  input/output  environment.  GRASPR  can  be  made  to  produce  a  single  analysis, 
by  allowing  each  complete  item  to  extend  at  most  one  partial  item.  Unlike  CPU,  however, 
GRASPR  can  be  made  to  generate  more  recognition  results  by  exploring  alternative  analyses, 
trying  harder  to  find  certain  cliches,  and  responding  to  incremental  changes  in  the  input 
program  that  may  uncover  more  cliches  and  cause  others  to  disappear. 

Laubsch  and  Eisenstadt’s  system  [81,  82]  distinguishes  between  two  types  of  cliches: 
standard  (general  programming  knowledge)  and  domain-specific.  Standard  cliches  are  rec¬ 
ognized  in  the  program’s  plan  diagram  by  nonhierarchical  pattern  matching  (as  opposed  to 
parsing).  Then  the  recognized  cliches  attach  effect  descriptions  to  the  code  in  which  they  are 
found.  Symbolic-evaluation  of  the  program’s  plan  diagram  computes  the  effect-description 
associated  with  the  entire  program.  Domain-specific  library  cliches  are  recognized  by  com¬ 
paring  the  program’s  effect  description  to  the  effect  descriptions  of  cliches  in  the  library. 
This  transforms  the  problem  of  program  recognition  into  the  problem  of  determining  the 
equivalences  of  formulas.  For  the  examples  given,  effect-descriptions  are  simple  expressions. 
However, in general, proving the equivalence of formulas is extremely hard.

Lutz  [88,  89]  has  developed  his  flowgraph  parsing  algorithm  as  a  general  tool  for  use 
in  artificial  intelligence.  He  proposes  some  applications  which  include  program  recognition. 
The  examples  he  sketches  use  flowgraphs  to  represent  plan  diagrams,  such  as  the  one  shown 
in  Figure  4-6.  He  proposes  using  a  program  recognition  process  similar  to  GRASPR’s.  In 
addition,  his  system  will  use  symbolic  evaluation  to  deal  with  unrecognizable  code.  Our 
graph  parsing  algorithm  evolved  from  the  graph  parsing  algorithm  Lutz  developed  [90]  for 
this  purpose.  Our  algorithm  extends  Lutz’s  to  handle  data  aggregation. 

Ning’s  PAT  [54,  100]  uses  basically  a  bottom-up  parsing  approach,  though  not  within  a 
formal  parsing  framework.  PAT  uses  a  rule-based  inference  engine  to  recognize  cliches  (i.e., 
derive  high-level  program  concepts,  or  events,  from  lower-level  ones).  Each  rule  consists 
of  a  trigger  pattern  of  program  events,  which  specifies  the  events  (operations  and  data 
types) composing a cliche and how they are related by various types of dependencies and
lexical relationships. The action of the rule is an assertion that a particular higher-level
event  (cliche)  exists  in  the  program  at  a  particular  location.  PAT  can  recognize  overlapping 
as  well  as  delocalized  cliches  and  it  can  do  partial  recognition.  Its  rules  also  distinguish 
some  events  within  patterns  as  “key”  events,  like  beacons,  that  are  searched  for  first.  This 
helps  to  reduce  the  search.  This  is  similar  to  specifying  a  node  ordering  in  our  graph 
grammar  rules.  The  main  difference  between  PAT’s  recognition  architecture  and  GRASPR's 
chart-parser-based  architecture  is  in  GRASPR’s  flexibility  of  control.  GRASPR  has  explicit  data- 
directed  mechanisms  for  guiding  and  advising  the  recognition  process. 

Hartman’s  UNPROG  [55]  performs  a  type  of  recognition  that  is  complementary  to  ours. 
Hartman  has  identified  a  restricted  class  of  cliches,  called  control  concepts,  that  can  be 
recognized  efficiently.  As  mentioned  earlier,  UNPROG  hierarchically  models  the  program’s 
flow  of  control  by  performing  a  proper  decomposition  on  the  program’s  control  flow  graph. 
Recognition  is  then  performed  by  simple  exact  graph  matching.  This  takes  advantage  of 
the  fact  that  typically  the  implementations  of  control  concepts  are  not  interleaved  with  each 
other  or  with  unrecognizable  code  within  propers. 

The  difference  between  this  technique  and  our  parsing  technique  is  that  UNPROG’s  de¬ 
composition  of  the  program  is  static  and  independent  of  the  matching,  while  in  parsing,  the 
decomposition  is  dynamically  driven  by  what  is  matched.  The  static,  a  priori  decomposition 
yields  efficiency  and  scalability  advantages.  The  search  is  reduced  because  control  concepts 
are localized within propers. There is no need to generate all partial matches of propers.
There  is  no  ambiguity  about  how  to  match  inputs  and  outputs  of  cliched  control  concept 
implementations  to  those  of  a  proper,  since  all  propers  have  one  input  and  one  output. 
Hartman’s  research  shows  the  benefits  of  good  decomposition  techniques. 

This technique works well for control concept recognition. However, in general, the
danger  of  decomposing  the  program  representation  and  then  looking  for  particular  cliches 
only  within  the  partitions  is  that  a  cliche  might  be  missed  if  it  is  not  contained  within 
some  partition  boundary.  This  technique  works  best  if  there  are  standard  decompositions 
of  cliches  and  the  cliches  appear  in  programs  in  these  same  organizations.  Future  research 
should  look  for  other  classes  of  cliches  like  control  concepts  and  for  methods  of  decomposition 
that  allow  them  to  be  recognized  efficiently. 

One  way  GRASPR  can  benefit  from  the  efficiency  of  a  priori  decomposition  without  sac¬ 
rificing  completeness  is  to  use  some  sort  of  decomposition,  such  as  subroutinization,  or 
bundles  of  slices  all  contributing  to  the  same  user-defined,  aggregate  data  structure  to  do 
an  initial,  quick  recognition.  Then  “try-harder”  later  by  looking  for  cliches  that  might  cross 
the  boundaries,  e.g.,  in  areas  where  no  cliche  was  recognized  or  by  extending  partial  items 
that  are  near-misses  or  have  salient  parts  matched  already.  Section  6.4.1  discussed  some  of 
these  ideas. 

A novel type of recognition is being pursued by Soni [129, 130] as part of the development
of a Maintainer’s Assistant. This system will focus on recognizing guidelines which
constrain  the  design  components  of  a  program  and  embody  global  interactions  between 
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the  components.  For  example,  guidelines  express  relations  between  the  slots  of  data  struc¬ 
tures  and  constraints  on  how  they  may  be  accessed  or  updated.  This  type  of  recognition  is 
orthogonal  to  the  recognition  of  cliches  reported  in  this  paper. 

A completely different approach to recognition was proposed by Biggerstaff [12, 13].
A  central  part  of  his  recognition  system  is  a  rich  domain  model.  This  model  contains 
machine-processable  forms  of  design  expectations  for  a  particular  domain,  as  well  as  infor¬ 
mal  semantic  concepts.  It  includes  typical  module  structures  and  the  typical  terminology 
associated  with  programs  in  a  particular  problem  domain.  The  goal  of  the  recognition  is  to 
link  these  conceptual  structures  to  parts  of  the  program,  based  on  the  correlation  (experi- 
entally  acquired)  between  the  structures  and  the  mnemonic  procedure  and  variable  names 
used  and  the  words  used  in  the  program’s  comments.  A  grep-like  pattern  recognition  is 
performed  on  the  program’s  text  (including  its  comments)  to  cluster  together  parts  of  the 
program  that  are  statistically  related.  (The  Unix  tool  grep  searches  files  for  given  regular 
expressions.) 

The  virtue  of  this  type  of  recognition  is  that  it  quickly  directs  the  user’s  attention  to 
sections  of  the  program  where  there  may  be  computational  entities  related  to  a  particular 
concept  in  the  domain.  While  this  technique  cannot  be  extended  to  provide  a  deeper 
understanding,  it  provides  a  way  of  focusing  the  search  of  other  more  formal  and  complete 
recognition  approaches,  such  as  GRASPR’s.  Like  Soni’s  recognition,  it  is  orthogonal  and 
complementary  to  the  recognition  of  cliches  reported  here. 


7.4  Applications 

Being  able  to  automatically  recognize  existing  code  has  applications  in  many  areas  of  soft¬ 
ware  development  and  maintenance,  including  software  reuse,  verification,  debugging,  op¬ 
timization,  program  translation,  and  documentation.  The  ability  to  recognize  cliches  in  a 
broad  range  of  programs  is  also  useful  for  computer-aided  instruction  of  programmers.  See 
Wills  [144,  145]  and  Hartman  [55]  for  discussions  of  these  applications. 

Two  other  applications  of  our  flow  graph  formalism  and  parser,  not  related  to  program¬ 
ming,  are  automatic  circuit  verification  and  plan  recognition.  Circuit  verification  has  been 
cast  as  a  graph  matching  problem,  with  much  work  focusing  on  heuristic  techniques  for 
solving  graph  isomorphism  [22,  108].  More  recently,  Bamji  [8,  9]  has  shown  how  graph 
parsing  can  be  applied  to  this  problem.  This  gains  the  advantage  of  being  able  to  encode 
an  entire  design  methodology  into  a  design  grammar,  so  that  a  circuit  can  be  verified  with 
respect  to  a  class  of  correct  circuits,  not  just  one.  Our  parsing  algorithm  is  applicable  in 
this  area. 

Plan  recognition  shares  several  difficulties  with  program  recognition,  such  as  dealing 
with  variation  due  to  loose  temporal  ordering  constraints,  interleaved  steps,  and  shared 
steps  among  plans.  Graphical  nonlinear  plan  representations  are  amenable  to  the  graph 
parsing  technique  we  used  to  solve  these  problems  in  program  recognition. 
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Appendix  A 

Flow Graph Recognition is NP-Complete


Barton,  Berwick,  and  Ristad  ([10],  Chapter  7)  give  a  clever  reduction  of  the  vertex  cover 
problem  to  the  problem  of  recognizing  sentences  according  to  an  unordered  context-free 
grammar  (UCFG).  A  UCFG  is  a  context-free  string  grammar  in  which  the  symbols  in  a  right- 
hand  side  string  are  considered  unordered.  (So,  for  example,  given  a  UCFG  containing  the 
rule S -> xyz, S can be recognized in the strings xyz, yxz, zyx, etc.)
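
The unordered matching of a single right-hand side can be sketched as a multiset comparison. This is only an illustration for a rule whose right-hand side consists entirely of terminals, such as S -> xyz; the function name `matches_unordered` is hypothetical and the sketch does not handle non-terminals.

```python
from collections import Counter


def matches_unordered(rhs: str, candidate: str) -> bool:
    """Return True if `candidate` is some ordering of the terminals in `rhs`.

    Assumption: every symbol in `rhs` is a terminal, so unordered
    matching reduces to comparing multisets of symbols.
    """
    return Counter(rhs) == Counter(candidate)


if __name__ == "__main__":
    for s in ["xyz", "yxz", "zyx", "xyy"]:
        print(s, matches_unordered("xyz", s))
```

Comparing multisets rather than enumerating permutations keeps the check linear in the length of the string for this restricted case.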

Our  flow  graph  parsing  algorithm  can  be  used  to  perform  UCFG  parsing  (and  the  simpler 
recognition  problem)  on  a  special  class  of  UCFGs,  which  I  will  call  “fixed-UCFGs.”  Furthermore, 
the same reduction proof given by Barton, et al. can be used to prove that the fixed-UCFG
recognition  problem  is  NP-complete.  This  can  be  used  to  show  that  flow  graph  recognition 
is  NP-complete. 

The  class  of  fixed-UCFGs  is  the  class  in  which  each  non-terminal  derives  strings  of  a  fixed 
length k, where k can be different for different non-terminals. For example, this grammar

S -> A B | C D E
A -> a | x
B -> b y | w z
C -> c
D -> d | f
E -> e | g | h

is a fixed-UCFG. S only derives strings of length three (such as awz or cfh), B only derives
strings  of  length  two,  the  rest  of  the  non-terminals  all  derive  strings  of  length  one.  This 
grammar 

S -> A B
A -> a z | z y z
B -> b
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is  not  a  fixed-UCFG,  since  A  can  derive  two  different  length  strings. 

The grammar constructed in Barton, et al.’s NP-completeness proof to encode the vertex
cover  existence  question  is  always  a  fixed-UCFG.  So,  the  same  construction  can  be  used  to 
reduce  the  vertex  cover  problem  to  the  fixed-UCFG  recognition  problem  in  polynomial-time. 

We  reduce  the  fixed-UCFG  recognition  problem  to  flow  graph  recognition  as  follows.  For 
each  non-terminal,  we  first  compute  the  length  k  of  the  strings  it  derives.  This  can  be  done 
by  imposing  a  partial  ordering  on  the  non-terminals,  where  non-terminal  A  <  non-terminal 
B  if  A  appears  on  B’s  right-hand  side.*  Then  the  fc’s  can  be  computed  bottom-up  through 
the  partial  ordering  from  the  non-terminals  that  have  only  terminals  on  at  least  one  of  their 
rules’  right-hand  sides. 
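
The length computation can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: it uses memoized recursion, which visits non-terminals in the same bottom-up order as the partial ordering, and it assumes an acyclic grammar. The dictionary encoding, the names `derived_lengths`, `FIXED`, and `NOT_FIXED`, and the exact symbols of the second grammar are assumptions made for the example.

```python
def derived_lengths(grammar):
    """Map each non-terminal to the single length k of the strings it derives.

    `grammar` maps a non-terminal to a list of right-hand sides, each a list
    of symbols. Any symbol that is not a key of `grammar` is a terminal.
    Raises ValueError if a non-terminal derives strings of more than one
    length (the grammar is not a fixed-UCFG) or if the grammar is cyclic.
    """
    lengths = {}

    def length_of(symbol, visiting):
        if symbol not in grammar:          # terminals derive strings of length 1
            return 1
        if symbol in lengths:              # already computed
            return lengths[symbol]
        if symbol in visiting:
            raise ValueError(f"cycle through {symbol}")
        visiting = visiting | {symbol}
        ks = {sum(length_of(x, visiting) for x in rhs) for rhs in grammar[symbol]}
        if len(ks) != 1:
            raise ValueError(f"{symbol} derives strings of lengths {sorted(ks)}")
        lengths[symbol] = ks.pop()
        return lengths[symbol]

    for non_terminal in grammar:
        length_of(non_terminal, frozenset())
    return lengths


# The first example grammar above (a fixed-UCFG).
FIXED = {
    "S": [["A", "B"], ["C", "D", "E"]],
    "A": [["a"], ["x"]],
    "B": [["b", "y"], ["w", "z"]],
    "C": [["c"]],
    "D": [["d"], ["f"]],
    "E": [["e"], ["g"], ["h"]],
}

# A grammar in which A derives strings of two different lengths.
NOT_FIXED = {
    "S": [["A", "B"]],
    "A": [["a", "z"], ["z", "y", "z"]],
    "B": [["b"]],
}

if __name__ == "__main__":
    print(derived_lengths(FIXED))
    try:
        derived_lengths(NOT_FIXED)
    except ValueError as err:
        print("not a fixed-UCFG:", err)
```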

Next, for each rule in the fixed-UCFG, A -> x1 x2 x3 ... xn, deriving strings of length k, we
create  a  graph  grammar  rule  with 

1.  a  left-hand  side  node  of  type  A  having  k  inputs  and  k  outputs, 

2. a right-hand side flow graph containing n nodes, where the i-th node has type xi, and
each  terminal  node  has  a  single  input  and  a  single  output,  while  each  non-terminal 
node  has  j  inputs  and  j  outputs,  where  j  equals  the  length  of  strings  derived  by  that 
non-terminal,  and 

3.  the  rule  embedding  function  maps  the  i-th  input  (resp.  output)  of  A  to  the  i-th  input 
(resp.  output)  of  the  right-hand  side  graph.  (None  of  the  right-hand  sides  have  edges 
between  ports.) 

Finally,  the  input  string  is  translated  into  a  flow  graph  by  creating  a  node  for  each 
symbol,  with  the  type  of  the  node  being  the  symbol  type.  Each  node  has  one  input  and 
one  output.  There  are  no  edges  between  ports. 
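
The translation just described can be sketched in a few lines. The data types `Node` and `GraphRule` and the helper names are hypothetical, chosen only for this illustration; `lengths` is assumed to map each non-terminal to its fixed derived length k, and any symbol missing from it is treated as a terminal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    type: str
    inputs: int
    outputs: int


@dataclass(frozen=True)
class GraphRule:
    lhs: Node
    rhs: tuple  # tuple of Node; the right-hand side graph has no edges


def translate_rule(lhs, rhs, lengths):
    """Build the graph grammar rule for the fixed-UCFG rule lhs -> rhs."""
    k = lengths[lhs]
    rhs_nodes = tuple(Node(x, lengths.get(x, 1), lengths.get(x, 1)) for x in rhs)
    return GraphRule(Node(lhs, k, k), rhs_nodes)


def embedding(rule):
    """Entry i is the (rhs node index, port index) that the i-th port of the
    left-hand side maps to; the same map serves inputs and outputs."""
    ports = [(n, p) for n, node in enumerate(rule.rhs) for p in range(node.inputs)]
    assert len(ports) == rule.lhs.inputs  # holds for a fixed-UCFG rule
    return ports


def translate_string(s):
    """One node per input symbol, one input and one output each, no edges."""
    return [Node(ch, 1, 1) for ch in s]


if __name__ == "__main__":
    rule = translate_rule("S", ["A", "B"], {"S": 3, "A": 1, "B": 2})
    print(rule)
    print(embedding(rule))
    print(translate_string("awz"))
```

Because neither the right-hand sides nor the input graph contain edges, each rule is fully described by its node types and port counts, which is why the sketch stores no edge list.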

For example, Figures A-1a and b show a fixed-UCFG and the graph grammar into which
it would be translated. Figure A-1c shows how the input string is translated into a flow
graph. 

Now,  we  can  decide  whether  a  particular  input  sentence  is  in  the  language  generated  by 
the  fixed-UCFG  simply  by  determining  whether  the  flow  graph  is  in  the  language  generated 
by  the  flow  graph  grammar  encoding  of  the  fixed-UCFG.  The  flow  graph  is  in  the  language 
of  the  flow  graph  grammar  iff  the  input  sentence  is  in  the  fixed-UCFG’s  language. 

Since  the  NP-complete  problem  of  fixed-UCFG  recognition  can  be  reduced  to  flow  graph 
recognition,  the  flow  graph  recognition  problem  is  also  NP-complete. 

Note  that  the  type  of  flow  graph  recognition  that  we  are  showing  to  be  NP-complete  is 
simpler  than  the  flow  graph  parsing  problem.  This  in  turn  is  even  simpler  than  the  subgraph 
parsing  problem  in  which  program  recognition  is  cast.  This  means  that  even  if  we  were  just 

*Cycles in the grammar can be handled, but I do not describe how here. Alternatively, we can do this
NP-completeness proof with acyclic fixed-UCFGs.
[Figure A-1: Reducing fixed-UCFG recognition to flow graph recognition. (b) Graph grammar that the UCFG above is translated into. (c) An input string and the flow graph it is translated into.]

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trying  to  recognize  an  entire  program  as  a  single  cliche  and  even  if  we  did  not  need  to  deal 
with  fan-in  or  fan-out,  we  can  still  encounter  exponential  behavior. 

Readers  familiar  with  Brotsky’s  algorithm  might  contrast  flow  graph  parsing  (not  sub¬ 
graph  parsing  and  not  dealing  with  fan-in  or  fan-out  or  aggregation)  with  the  parsing 
Brotsky’s  algorithm  does  in  polynomial  time.  The  same  types  of  flow  graphs  are  parsed, 
using  the  same  types  of  flow  graph  grammars;  no  extension  to  the  flow  graph  formalism  is 
necessary.  The  crucial  distinction  is  that  Brotsky’s  parser  takes  an  additional  input  besides 
the input flow graph and the flow graph grammar, which is a specification of how the inputs
of the input graph match to the inputs of the start type of the grammar. This information
is used to predict the start type at a particular location (i.e., a particular matching of inputs
of the input graph to inputs of the start type). Our parser, on the other hand, must figure
out  all  the  possible  locations  at  which  a  non-terminal  can  be  found.  This  increases  the 
computational  complexity  of  the  problem. 
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Appendix  B 

The  Example  Programs 


This  appendix  contains  the  original  PiSim  and  CST  source  code,  as  well  as  their  functional 
versions.  Section  5.2.5  lists  the  changes  made  in  translating  between  the  original  and 
functional  versions.  The  original  PiSia  code  is  listed  on  pages  260  to  265.  Its  functional 
version  is  found  on  pages  266  to  274.  The  original  CST  code  is  on  pages  275  to  280  and  its 
functional  version  is  on  pages  281  to  288. 
;;; Syntax:Common-Lisp; Mode:LISP; Base:10; Package:USER

;; Pi Simulator original version
(in-package 'user)

(proclaim '(optimize (compilation-speed 0) (safety 3) (speed 3)))

;; Global variables

(defconstant *Machine-Dimensions* '(4 4 4)
  "this is the machine dimensions")

(defvar *Event-Queue* nil
  "this is the global event queue")

(defvar *Nodes* nil
  "this is the node array")

(defvar *Global-Bindings* (Make-Hash-Table)
  "these are the bindings for nodals, constants, etc.")

(defvar *Nodal-count* 0
  "this is the number of defined nodals")

(defvar *Debug-Level* 0
  "this is the debugging level")

(defvar *Log* nil
  "this is the logging information")

;; Structures

(defstruct Node
  (Time 0)
  (ID 0)
  (Segments (Make-Hash-Table))
  (Nodals nil))

(defstruct Segment
  (Type nil)
  (Data nil)
  (Size 0))

(defstruct Task
  (Handler nil)
  (Node nil)
  (Segment nil)
  (IP 0)
  (Status 'New))

(defstruct Message
  (Destination nil)
  (Length 0)
  (Type nil)
  (Arguments nil))

(defstruct Event
  (Time 0)
  (Object nil))

(defstruct Handler
  (Name nil)
  (Instructions nil)
  (Arity 0)
  (Number-Of-Locals 0)
  (Bindings (Make-Hash-Table)))

(defstruct O-Sync
  (Suspended-Tasks nil))

(defstruct B-sync
  (Count 0)
  (Suspended-Tasks nil))

(defstruct Log
  (Type 'All)
  (Task-Status-Profile (Make-Hash-Table))
  (Task-Type-Profile (Make-Hash-Table))
  (Instruction-Type-Profile (Make-Hash-Table))
  (Operation-Type-Profile (Make-Hash-Table))
  (Concurrency-List nil)
  (Old-Logs nil))

(defstruct Delta
  (Time 0)
  (Value 0))

;;=============================================================
;; Nodes

;; This translates a node ID to a node.
(defun Translate-Node (Node-ID)
  (aref *Nodes* Node-ID))

;; This function returns the number of nodes.
(defun Number-Of-Nodes ()
  (array-total-size *Nodes*))

;; This function creates the node array according to the dimension
;; constant.
(defun Make-Nodes ()
  (loop with Number-Of-Nodes = (apply #'* *Machine-Dimensions*)
        with Nodes = (make-array Number-Of-Nodes)
        for ID from 0 below Number-Of-Nodes
        for Node = (Make-Node :ID ID)
        for Nodals-Segment = (Create-Read-Write-Segment 100)
        do (setf (aref Nodes ID) Node)
        do (setf (Node-Nodals Node)
                 (Add-Segment Nodals-Segment Node))
        finally (setq *Nodes* Nodes)))

;; This function resets the node time and clears the node segment.
(defun Clear-Nodes ()
  (loop for Node being the array-elements of *Nodes*
        for Nodals-ID = (Node-Nodals Node)
        for Nodals = (Translate-Segment-On-Node Nodals-ID Node)
        doing (setf (Node-Time Node) 0)
        doing (Clear-Hash-Table (Node-Segments Node))
        doing (Hash-Insert (Node-Segments Node) Nodals-ID Nodals)
        doing
          (loop with Data = (Segment-Data Nodals)
                for Index from 0 below (array-total-size Data)
                doing (setf (aref Data Index) 'Unbound))))


;; Segments

;; This adds a segment to the node's segment translations.  It
;; returns the unique segment ID.
(defun Add-Segment (Segment Node)
  (let ((Segment-ID (gensym "Segment-")))
    (Hash-Insert (Node-Segments Node)
                 Segment-ID
                 Segment)
    Segment-ID))

;; This removes a segment ID from the node's segment translations.
(defun Delete-Segment (Segment-ID Node)
  (Hash-Delete (Node-Segments Node)
               Segment-ID))

;; This translates a segment ID to a segment on the specified
;; task's node.
(defun Translate-Segment (Segment-ID Task)
  (Translate-Segment-On-Node Segment-ID
                             (Task-Node Task)))

;; This translates a segment ID on a specified node.
(defun Translate-Segment-On-Node (Segment-ID Node)
  (let ((Segment (Hash-Lookup (Node-Segments Node)
                              Segment-ID)))
    (if (null Segment)
        (break "PiSim error: missing segment")
        Segment)))

;; This function creates a read-write segment.
(defun Create-Read-Write-Segment (Size)
  (Make-Segment :Size Size
                :Type 'Read-Write
                :Data (make-array Size)))

;; This function creates an associative set segment.
(defun Create-Associative-Set-Segment (Size)
  (Make-Segment :Size Size
                :Type 'Associative-Set
                :Data (Make-Hash-Table Size)))

;; This function creates a cache segment.
(defun Create-Cache-Segment (Size)
  (Make-Segment :Size Size
                :Type 'Cache
                :Data (make-array Size)))
(defun Match-Cache (Key Segment)
  (let* ((Index (Cache-Hash Key (Segment-Size Segment)))
         (Entry (aref (Segment-Data Segment) Index)))
    (if (and (not (equal Entry 'Empty))
             (equal (first Entry) Key))
        (rest Entry)
        'Miss)))


;; This function reads a read-write segment.
(defun Read-Segment (Segment Offset)
  (unless (equal (Segment-Type Segment)
                 'Read-Write)
    (break "PiSim error: incorrect access operation for ~
            segment type"))
  (aref (Segment-Data Segment) Offset))

;; This function writes a read-write segment.
(defun Write-Segment (Segment Offset New-Value)
  (unless (equal (Segment-Type Segment)
                 'Read-Write)
    (break "PiSim error: incorrect access operation for ~
            segment type"))
  (setf (aref (Segment-Data Segment) Offset)
        New-Value))

;; This function attempts to match a key in an associative set
;; or cache segment.
(defun Match-Segment (Segment Key)
  (case (Segment-Type Segment)
    (Associative-Set
     (Hash-Lookup (Segment-Data Segment) Key))
    (Cache
     (Match-Cache Key Segment))
    (otherwise
     (break "PiSim error: incorrect access operation for ~
             segment type"))))

;; This function inserts a key in an associative set or cache
;; segment.
(defun Insert-Segment (Segment Key New-Value)
  (case (Segment-Type Segment)
    (Associative-Set
     (Hash-Insert (Segment-Data Segment)
                  Key
                  New-Value))
    (Cache
     (Insert-Cache Key Segment New-Value))
    (otherwise
     (break "PiSim error: incorrect access operation for ~
             segment type"))))

;; This function removes a key from an associative set or cache
;; segment.
(defun Remove-Key-Segment (Segment Key)
  (case (Segment-Type Segment)
    (Associative-Set
     (Hash-Delete (Segment-Data Segment) Key))
    (Cache
     (Remove-Key-Cache Key Segment))
    (otherwise
     (break "PiSim error: incorrect access operation for ~
             segment type"))))

;; This function clears an associative set or cache segment.
(defun Clear-Segment (Segment)
  (case (Segment-Type Segment)
    (Associative-Set
     (Clear-Hash-Table (Segment-Data Segment)))
    (Cache
     (Clear-Cache Segment))
    (otherwise
     (break "PiSim error: incorrect access operation for ~
             segment type"))))


;; Caches

;; In PiSim, caches are implemented as direct mapped arrays.  A
;; hash function computes an index into an array.  Array entries
;; are cons cells of the format: (Key . Value).

;; This is the hash function for caches.
(defun Cache-Hash (Key Size)
  (when (numberp Key)
    (setq Key (format nil "~a" Key)))
  (loop with String = (string Key)
        for Character being the array-elements of String
        summing (char-int Character)
          into Value
        finally (return (mod Value Size))))

;; This function attempts to match a key in a hash table.
;; If the key is found, the corresponding value is returned.
;; Otherwise, 'Miss is returned.

;; This function writes an entry in the cache, possibly overwriting
;; another value.
(defun Insert-Cache (Key Segment New-Value)
  (setf (aref (Segment-Data Segment)
              (Cache-Hash Key (Segment-Size Segment)))
        (cons Key New-Value)))

;; This function removes a key from a cache.  If the key is not present,
;; no action is taken.
(defun Remove-Key-Cache (Key Segment)
  (let* ((Index (Cache-Hash Key (Segment-Size Segment)))
         (Entry (aref (Segment-Data Segment) Index)))
    (when (and (not (equal Entry 'Empty))
               (equal (first Entry) Key))
      (setf (aref (Segment-Data Segment) Index)
            'Empty))))

;; This function clears a cache.
(defun Clear-Cache (Segment)
  (loop with Data = (Segment-Data Segment)
        for Index from 0 below (array-total-size Data)
        doing (setf (aref Data Index) 'Empty)))

;; Tasks

;; This returns the time of a task.  This is defined as the node
;; time for the specified task.
(defun Time-Of (Task)
  (Node-Time (Task-Node Task)))

;; This sets the time of the specified task (i.e. the time of
;; the node of the specified task).
(defun Set-Time-Of (Task New-Time)
  (setf (Node-Time (Task-Node Task))
        New-Time))

;; This increments the task time by the specified delta.
(defun Increment-Time-Of (Task Delta)
  (incf (Node-Time (Task-Node Task))
        Delta))

;; This returns the handler type of the task.
(defun Handler-Name-Of (Task)
  (Handler-Name (Task-Handler Task)))

;; This returns the node ID of the specified task's nodes.
(defun Node-Of (Task)
  (Node-ID (Task-Node Task)))

;; This function creates a new task segment of the specified length.
;; The number of arguments and message length values are compared with
;; the handler arity and arity plus number of locals respectively.  Two
;; is added to the arity and number of locals to account for the message
;; length and type information stored in the segment.  The segment is
;; then initialized with the supplied arguments.
(defun Create-Task-Segment (Length Task-Type Arguments Handler)
  (let ((New-Segment (Create-Read-Write-Segment Length)))
    (when (not (= (Handler-Arity Handler)
                  (length Arguments)))
      (break "PiSim error: arity mismatch"))
    (when (not (= Length (+ (Handler-Arity Handler)
                            (Handler-Number-Of-Locals Handler)
                            2)))
      (break "PiSim error: length/handler storage mismatch"))
    (Write-Segment New-Segment 0 Length)
    (Write-Segment New-Segment 1 Task-Type)
    (loop for Argument in Arguments
          for Index from 2
          doing (Write-Segment New-Segment Index Argument))
    New-Segment))

;; This function creates a new task for a message.  The handler and
;; node are determined.  A new segment is created and initialized.
;; After the new task is created, its segment is added to the task's
;; node.  Finally the new task is returned.
(defun Create-Task (Message)
  (let* ((Handler (Get-Handler (Message-Type Message)))
         (Node (Translate-Node (Message-Destination Message)))
         (New-Segment (Create-Task-Segment
                       (Message-Length Message)
                       (Message-Type Message)
                       (Message-Arguments Message)
                       Handler))
         (New-Segment-ID (Add-Segment New-Segment Node))
         (New-Task (Make-Task :Handler Handler
                              :Node Node
                              :Segment New-Segment-ID)))
    New-Task))

;; This function executes a task.  It executes instructions which
;; change a task's status.  If the status is 'Running, another
;; instruction is executed.
(defun Execute-Task (Task)
  (loop doing (Execute-Next-Instruction Task)
        while (equal (Task-Status Task) 'Running)))

;; Events

;; This function enqueues an event in the global event queue.
;; Events are enqueued in order on increasing event time.
;; ** Note that when 2 events have the same time, the one sent
;; to Enqueue-Event first has higher priority.
(defun Enqueue-Event (New-Event)
  (if (or (null *Event-Queue*)
          (< (Event-Time New-Event)
             (Event-Time (first *Event-Queue*))))
      (push New-Event *Event-Queue*)
      (Insert-Event New-Event *Event-Queue*)))

;; This function is used to enqueue events inside the event queue.
;; It is part of a recursive, priority queue insert algorithm.
(defun Insert-Event (New-Event Event-Queue)
  (if (or (null (rest Event-Queue))
          (< (Event-Time New-Event)
             (Event-Time (second Event-Queue))))
      (push New-Event (rest Event-Queue))
      (Insert-Event New-Event (rest Event-Queue))))

;; This function dequeues and returns a event from the global
;; event-queue.  If the queue is empty, nil is returned.
(defun Dequeue-Event ()
  (pop *Event-Queue*))

;; This function clears the event queue.
(defun Clear-Event-Queue ()
  (setq *Event-Queue* nil))

;; This function dequeues and executes the next event in the event
;; queue.  If the event is a message, a new task is created.  The
;; node time is adjusted if the event time is later than node
;; time.  If a event is executed, t is returned.
(defun Execute-Next-Event ()
  (let* ((Event (Dequeue-Event))
         Task)
    (setq Task (Create-Task (Event-Object Event)))
    (Set-Time-Of Task
                 (if (> (Event-Time Event)
                        (Time-Of Task))
                     (Event-Time Event)
                     (Time-Of Task)))
    (Debug-Print 1
                 "(start: task ~a node ~d time ~d old status ~a)~%"
                 (Handler-Name-Of Task) (Node-Of Task)
                 (Time-Of Task) (Task-Status Task))
    (Log-Task Task)
    (setf (Task-Status Task) 'Running)
    (Adjust-Concurrency-List (Time-Of Task) 1)
    (Execute-Task Task)
    (Adjust-Concurrency-List (Time-Of Task) -1)
    (Debug-Print 1
                 "(stop: task ~a node ~d time ~d status ~a)~%"
                 (Handler-Name-Of Task) (Node-Of Task)
                 (Time-Of Task) (Task-Status Task))))

;  sssssaxssssssssassSBBSSsaesssssaasssBsasasssaBssssassssssKsskss 

Handlers 

;; This predicate tests if a statement is a label.

(defun Label? (Statement)
  (symbolp Statement))


;; This predicate tests if a statement is an instruction.

(defun Instruction? (Statement)
  (listp Statement))

;; This function inserts a binding into a handler's bindings. If the
;; specified handler is 'Global, the binding is inserted in the global
;; bindings.

(defun Insert-Binding (Name Value Handler)
  (if (equal Handler 'Global)
      (Hash-Insert *Global-Bindings* Name Value)
      (Hash-Insert (Handler-Bindings Handler) Name Value)))

;; This function looks up the binding of a symbol in the handler. If
;; it is not found there, the global bindings are checked.

(defun Lookup-Binding (Name Handler)
  (or (Hash-Lookup (Handler-Bindings Handler) Name)
      (Hash-Lookup *Global-Bindings* Name)))

;; This function returns the number of instructions in a handler.

(defun Number-Of-Instructions (Handler)
  (array-total-size (Handler-Instructions Handler)))

;; This function returns the handler object for the handler name. If
;; the handler does not exist, an error message is printed.

(defun Get-Handler (Name)
  (let ((Handler (get Name 'Handler)))
    (if (null Handler)
        (break "PiSim error: unknown handler")
        Handler)))

;; This function determines the number of instructions in a sequence
;; of statements and builds an instruction array of the correct size.
;; It then reads each statement. If it is an instruction, it is
;; inserted into the array. If it is a label, the label and
;; statement index are inserted into the handler's bindings.

(defun Make-Instructions (Statements Handler)
  (let (Instructions)
    (loop for Statement in Statements
          unless (Label? Statement)
            count Statement
              into Number-Of-Statements
          finally (setf Instructions
                        (make-array Number-Of-Statements)))
    (loop with Index = 0
          for Statement in Statements
          when (Label? Statement)
            do (Insert-Binding Statement Index Handler)
          when (Instruction? Statement)
            do (setf (aref Instructions Index)
                     Statement)
               (incf Index))
    (setf (Handler-Instructions Handler)
          Instructions)))

;; This function indexes the parameters and locals in a handler.
;; This includes assigning each parameter and local an index in the
;; handler segment. These assignments are included in the handler's
;; bindings. The arity and number of locals are also set.

(defun Index-Parameters-And-Locals (Parameters Locals Handler)
  (loop for Parameter in Parameters
        for Index from 2
        doing (Insert-Binding Parameter Index Handler))
  (loop for Local in Locals
        for Index from (+ (length Parameters) 2)
        doing (Insert-Binding Local Index Handler))
  (setf (Handler-Arity Handler)
        (length Parameters))
  (setf (Handler-Number-Of-Locals Handler)
        (length Locals)))

;; This function reads a handler from an expression. The resultant
;; handler is stored on the property list of the handler name.

(defun Read-Handler (Expression)
  (let ((Name (first Expression))
        (Parameters (second Expression))
        (Locals (third Expression))
        (Statements (nthcdr 3 Expression))
        (New-Handler (Make-Handler)))
    (setf (Handler-Name New-Handler) Name)
    (Index-Parameters-And-Locals Parameters Locals New-Handler)
    (Make-Instructions Statements New-Handler)
    (setf (get Name 'Handler) New-Handler)))

;; This allows the definition of handlers. This should be part
;; of a more general reader.

(defun Define-Handler (&rest Expression)
  (Debug-Print 0 "~&loading handler ~a~%" (first Expression))
  (Read-Handler Expression)
  nil)


;; Nodals

;; This allows the definition of nodals (node variables). An
;; index is assigned (using the number of existing nodals). A
;; new global binding is added.

(defun Define-Nodal (Name)
  (Debug-Print 0 "~&defining nodal ~a~%" Name)
  (cond ((not (null (Hash-Lookup *Global-Bindings* Name)))
         (format t
                 "~&Warning: ~a has already been defined globally~%"
                 Name))
        (t
         (Insert-Binding Name *Nodal-Count* 'Global)
         (incf *Nodal-Count*))))

;;================================================================

;; Constants

;; This allows the definition of global constants. The binding
;; is added to the global bindings.

(defun Define-Constant (Name Value)
  (Debug-Print 0 "~&defining constant ~a~%" Name)
  (Insert-Binding Name Value 'Global))


;; A nested expression (a list) is in the form (symbol arg1 arg2...).
;; In this case, Apply-Operation is recursively called.

(defun Evaluate (Active-Task Expression)
  (when (equal (Task-Status Active-Task)
               'RUNNING)
    (typecase Expression
      ((or number string)
       Expression)
      (symbol
       (or (Lookup-Binding Expression (Task-Handler Active-Task))
           Expression))
      (list
       (Apply-Operation (first Expression)
                        Active-Task
                        (rest Expression)))
      (otherwise
       (break "PiSim error: unknown expression")))))

;; This function returns the operation function for the operation
;; name. If the operation does not exist, an error message is
;; printed.

(defun Get-Operation (Name)
  (let ((Operation (get Name 'Operation)))
    (if (null Operation)
        (break "PiSim error: unknown operation")
        Operation)))

;;  This  is  used  to  define  processor  operations. 


;;  Instructions 

;; This function returns the next instruction of the handler to be
;; executed. The current instruction pointer (IP) is obtained
;; from the task. The instructions are obtained from the handler.
;; The task instruction pointer is incremented. Note: the
;; instruction pointer is incremented AFTER the next instruction
;; is fetched.

(defun Next-Instruction (Task)
  (let ((IP (Task-IP Task)))
    (when (>= IP
              (Number-Of-Instructions (Task-Handler Task)))
      (break "PiSim error: IP out of range"))
    (incf (Task-IP Task))
    (aref (Handler-Instructions (Task-Handler Task))
          IP)))

;; This function executes a single instruction. It first
;; locates the next instruction using the task instruction
;; pointer. The instruction pointer is incremented. Then it
;; applies the operation to the arguments.

(defun Execute-Next-Instruction (Active-Task)
  (let ((Instruction (Next-Instruction Active-Task)))
    (Debug-Print 2 "[executing instruction ~a]~%"
                 (first Instruction))
    (Log-Instruction Instruction)
    (Apply-Operation (first Instruction)
                     Active-Task
                     (rest Instruction))))


;; Operations

;; This function applies a processor operation to a list of
;; arguments. Each argument is evaluated before the operation
;; is applied. The apply only takes place if the task status
;; is 'RUNNING.


(defmacro Define-Operation (Name &rest Rest)
  `(setf (get ',Name 'Operation)
         #'(lambda ,@Rest)))

;; Debugging

;; This prints debug messages depending on the debug level.

(defmacro Debug-Print (Level Format &rest Arguments)
  `(when (<= ,Level *Debug-Level*)
     (format t ,Format ,@Arguments)))

;; This function sets the debug level.

(defun Set-Debug-Level (New-Level)
  (setq *Debug-Level* New-Level))


;; Logging

;; This function starts a new log, saving the current log.

(defun Start-New-Log ()
  (setq *Log*
        (Make-Log :Type (Log-Type *Log*)
                  :Old-Logs *Log*)))

;; This is used in a counting profile. The category count is
;; incremented, or created, if non-existent.

(defun Collect-Profile (Category Profile)
  (if (Hash-Lookup Profile Category)
      (Hash-Insert Profile
                   Category
                   (1+ (Hash-Lookup Profile Category)))
      (Hash-Insert Profile Category 1)))

;; This predicate tests if logging is enabled. If the log is non-nil
;; and its type is not 'None, logging is on.


(defun Apply-Operation (Operation Active-Task Arguments)
  (let ((Argument-List
          (loop for Argument in Arguments
                collecting (Evaluate Active-Task Argument))))
    (when (equal (Task-Status Active-Task) 'RUNNING)
      (Log-Operation Operation)
      (push Active-Task Argument-List)
      (apply (Get-Operation Operation)
             Argument-List))))

;; This function evaluates the expression and returns the results.
;; This is an evaluator appropriate for the nested expressions
;; in a Pi program. Expressions are only evaluated if the task
;; status is 'RUNNING. The following expression types are
;; possible:
;;
;; A number or string returns the value of the number or string.
;;
;; A symbol is looked up in the handler bindings. If it is
;; present, the corresponding value is returned. Otherwise, the
;; symbol is returned.


(defun Logging? ()
  (not (or (null *Log*)
           (equal (Log-Type *Log*) 'None))))

;; This function logs the specified task. Presently, profiles of task types
;; and statuses are maintained.

(defun Log-Task (Task)
  (when (Logging?)
    (Collect-Profile (Task-Status Task)
                     (Log-Task-Status-Profile *Log*))
    (when (equal (Task-Status Task) 'New)
      (Collect-Profile (Handler-Name-Of Task)
                       (Log-Task-Type-Profile *Log*)))))

;; This function collects statistics on instruction types.

(defun Log-Instruction (Instruction)
  (when (Logging?)
    (cond ((not (equal (first Instruction) 'Write))
           (Collect-Profile (first Instruction)
                            (Log-Instruction-Type-Profile *Log*)))
          ((not (listp (fourth Instruction)))
           (Collect-Profile 'Initialize
                            (Log-Instruction-Type-Profile *Log*)))
          ((equal (first (fourth Instruction)) 'Read)
           (Collect-Profile 'Move
                            (Log-Instruction-Type-Profile *Log*)))
          (t
           (Collect-Profile (first (fourth Instruction))
                            (Log-Instruction-Type-Profile
                             *Log*))))))

;; This function creates an operation profile.

(defun Log-Operation (Operation)
  (when (Logging?)
    (Collect-Profile Operation
                     (Log-Operation-Type-Profile *Log*))))

;; This function searches down a sorted list of deltas looking
;; for an entry at a specified time. If such an entry is found,
;; its value is adjusted by Change. If no such value is found,
;; a new delta is created and inserted at the correct position
;; in the list.

(defun Adjust-Concurrency-List (Time Change)
  (when (Logging?)
    (let ((Concurrency-List (Log-Concurrency-List *Log*)))
      (cond ((or (null Concurrency-List)
                 (< Time
                    (Delta-Time (first Concurrency-List))))
             (push (Make-Delta :Time Time
                               :Value Change)
                   (Log-Concurrency-List *Log*)))
            ((= Time
                (Delta-Time (first Concurrency-List)))
             (incf (Delta-Value (first Concurrency-List))
                   Change))
            (t
             (Adjust-Rest-Of-Concurrency-List
              Time Change Concurrency-List))))))

;; This is the recursive part of Adjust-Concurrency-List.

(defun Adjust-Rest-Of-Concurrency-List (Time Change
                                        Concurrency-List)
  (cond ((or (null (rest Concurrency-List))
             (< Time (Delta-Time (second Concurrency-List))))
         (rplacd Concurrency-List
                 (cons (Make-Delta :Time Time
                                   :Value Change)
                       (rest Concurrency-List))))
        ((= Time
            (Delta-Time (second Concurrency-List)))
         (incf (Delta-Value (second Concurrency-List))
               Change))
        (t
         (Adjust-Rest-Of-Concurrency-List
          Time Change (rest Concurrency-List)))))
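
;; Illustrative sketch, not part of the original listing. The
;; concurrency list above is a time-ordered list of deltas (+1 when a
;; task starts, -1 when it stops). One plausible way to read such a
;; list is as a running sum, giving the number of active tasks after
;; each recorded time. The name Sketch-Concurrency-Profile and the use
;; of plain (Time . Change) pairs instead of Delta structures are
;; assumptions made so the example stands on its own.

(defun Sketch-Concurrency-Profile (Deltas)
  ;; Deltas is a list of (Time . Change) pairs sorted by Time.
  (let ((Level 0))
    (loop for (Time . Change) in Deltas
          do (incf Level Change)
          collect (cons Time Level))))

;; Example: one task starts at time 0, another at time 2, and both
;; stop at time 5.
;; (Sketch-Concurrency-Profile '((0 . 1) (2 . 1) (5 . -2)))
;; returns ((0 . 1) (2 . 2) (5 . 0))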

;; This function prints the information from the current log.

(defun Print-Log-Information ()
  (when (or (equal (Log-Type *Log*) 'All)
            (equal (Log-Type *Log*) 'Profile))
    (Print-Profile-Data))
  (when (or (equal (Log-Type *Log*) 'All)
            (equal (Log-Type *Log*) 'Plot))
    (Plot-Concurrency)))

;; Randoms

;; This function estimates the delivery delay of a message. It
;; should be better than it is now.

(defun Delivery-Delay (Source Destination Length)
  (when (or (>= Source (Number-Of-Nodes))
            (minusp Source)
            (>= Destination (Number-Of-Nodes))
            (minusp Destination))
    (break "PiSim error: illegal node number"))
  (when (or (minusp Length)
            (zerop Length))
    (break "PiSim error: illegal message length"))
  (loop for Dimension in *Machine-Dimensions*
        collecting (mod Source Dimension)
          into Source-Components
        doing (setq Source (floor Source Dimension))
        collecting (mod Destination Dimension)
          into Destination-Components
        doing (setq Destination (floor Destination Dimension))
        finally (return
                  (loop for Source-Component
                          in Source-Components
                        for Destination-Component
                          in Destination-Components
                        summing (abs (- Source-Component
                                        Destination-Component))
                          into Distance
                        finally (return (+ Distance (- Length 1)))))))
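
;; Illustrative sketch, not part of the original PiSim listing. The
;; estimate computed by Delivery-Delay above is the Manhattan distance
;; between the mixed-radix coordinates of the two node IDs, plus
;; (Length - 1). The helper names Sketch-Node-Coordinates and
;; Sketch-Delivery-Delay, and the explicit Dimensions argument, are
;; assumptions made so the example runs without the rest of the
;; simulator.

(defun Sketch-Node-Coordinates (ID Dimensions)
  ;; Peel off one coordinate per dimension, least significant first.
  (loop for Dimension in Dimensions
        collect (mod ID Dimension)
        do (setq ID (floor ID Dimension))))

(defun Sketch-Delivery-Delay (Source Destination Length Dimensions)
  ;; Sum the per-dimension distances, then add (Length - 1).
  (+ (loop for S in (Sketch-Node-Coordinates Source Dimensions)
           for D in (Sketch-Node-Coordinates Destination Dimensions)
           sum (abs (- S D)))
     (- Length 1)))

;; On a 4 x 4 x 4 machine, node 5 has coordinates (1 1 0) and node 63
;; has coordinates (3 3 3), so a message of length 3 between them is
;; estimated at 2 + 2 + 3 + (3 - 1) = 9:
;; (Sketch-Delivery-Delay 5 63 3 '(4 4 4))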

;; This function injects a starting message into the machine. It
;; starts by calculating the message length and destination. The
;; message is then enqueued, and events are executed until the
;; event queue is empty.

(defun Inject (Type &rest Arguments)
  (Make-Nodes)
  (Clear-Nodes)
  (Clear-Event-Queue)
  (let* ((Handler (Get-Handler Type))
         (Length (+ (Handler-Arity Handler)
                    (Handler-Number-Of-Locals Handler)
                    2))
         (Destination (random (Number-Of-Nodes)))
         (Arrival-Time (Node-Time (Translate-Node Destination)))
         (Message (Make-Message :Destination Destination
                                :Length Length
                                :Type Type
                                :Arguments Arguments))
         (Event (Make-Event :Time Arrival-Time
                            :Object Message)))
    (Enqueue-Event Event)
    (loop
      (cond ((null *Event-Queue*)
             (return))
            (t
             (Execute-Next-Event))))))

;; Hash Table Functions

(defconstant MIN_HASH_TABLE_SIZE 11)

(defstruct Entry
  (Key nil :type symbol)
  (Value nil :type any))

(defstruct HashTable
  (Num-Buckets nil :type integer)
  (Number-Entries nil :type integer)
  (Buckets nil :type array))

;;; This function inserts an entry into the hash table. If a bucket
;;; collision occurs, the entry is inserted in the list in increasing key
;;; order. If a key collision occurs, the older entry is overwritten.
;;; This function also increases the hash table size if necessary.

(defun Hash-Insert (Table Key Value)
  (let* ((Index (Hash-Function Key
                               (HashTable-Num-Buckets Table)))
         (Bucket-List (aref (HashTable-Buckets Table)
                            Index)))
    (cond ((or (null Bucket-List)
               (string< Key (Entry-Key (car Bucket-List))))
           (push (Make-Entry :Key Key
                             :Value Value)
                 (aref (HashTable-Buckets Table)
                       Index))
           (setf (HashTable-Number-Entries Table)
                 (1+ (HashTable-Number-Entries Table))))
          (t
           (let ((This-Entry (car Bucket-List)))
             (cond ((string= Key (Entry-Key This-Entry))
                    ;; If Key = key of This-Entry, then overwrite older
                    ;; bucket entry. (New bucket has same Key as older
                    ;; bucket entry, but new entry value.)
                    (format t "~%Bashing older bucket entry ~A."
                            This-Entry)
                    (setf (Entry-Value This-Entry)
                          Value))
                   (t
                    (Splice-In-Bucket
                     Key Value Bucket-List Table))))))
    (if (>= (HashTable-Number-Entries Table)
            (HashTable-Num-Buckets Table))
        (Hash-Resize Table)
        Table)))

(defun Splice-In-Bucket (Key Value Bucket-List Table)
  (let* ((Next-List (cdr Bucket-List)))
    (cond ((or (null Next-List)
               (string< Key (Entry-Key (car Next-List))))
           (rplacd Bucket-List
                   (cons (Make-Entry :Key Key
                                     :Value Value)
                         Next-List))
           (setf (HashTable-Number-Entries Table)
                 (1+ (HashTable-Number-Entries Table))))
          (t
           (let ((This-Entry (car Next-List)))
             (cond ((string= Key (Entry-Key This-Entry))
                    ;; If Key = key of This-Entry, then overwrite
                    ;; older bucket entry's value.
                    (format t "~%Bashing older bucket entry ~A."
                            This-Entry)
                    (setf (Entry-Value This-Entry)
                          Value))
                   (t
                    (Splice-In-Bucket
                     Key Value Next-List Table))))))))

;;; This function resizes the hash table and rehashes the
;;; entries. The hash table size is approximately doubled.

(defun Hash-Resize (Table)
  (let* ((Old-Buckets (HashTable-Buckets Table))
         (Old-Size (HashTable-Num-Buckets Table))
         (New-Size
           (Determine-Hash-Table-Size
            (* (HashTable-Num-Buckets Table) 2))))
    (setf (HashTable-Num-Buckets Table)
          New-Size)
    (setf (HashTable-Buckets Table)
          (Make-Hash-Buckets New-Size))
    (setf (HashTable-Number-Entries Table)
          0)
    (Copy-Over-Buckets 0 Old-Size Old-Buckets Table)
    Table))


(defun Hash-Delete (Table Key)
  (let* ((Index (Hash-Function Key
                               (HashTable-Num-Buckets Table)))
         (Bucket-List (aref (HashTable-Buckets Table)
                            Index)))
    (if (null Bucket-List)
        Table
        (let ((This-Entry (car Bucket-List)))
          (cond ((string> Key (Entry-Key This-Entry))
                 (Splice-Out-Bucket Key Bucket-List Table))
                ((string= Key (Entry-Key This-Entry))
                 (setf (aref (HashTable-Buckets Table)
                             Index)
                       (cdr Bucket-List))
                 (setf (HashTable-Number-Entries Table)
                       (1- (HashTable-Number-Entries Table))))
                (t ;; Key string< key of This-Entry, so Key isn't found
                 Table))))))

(defun Splice-Out-Bucket (Key Bucket-List Table)
  (let ((Next-List (cdr Bucket-List)))
    (if (null Next-List)
        Table ;; fell off end of bucket list. Key not found
        (let ((This-Entry (car Next-List)))
          (cond ((string> Key (Entry-Key This-Entry))
                 (Splice-Out-Bucket Key Next-List Table))
                ((string= Key (Entry-Key This-Entry))
                 (rplacd Bucket-List
                         (cdr Next-List))
                 (setf (HashTable-Number-Entries Table)
                       (1- (HashTable-Number-Entries Table))))
                (t ;; Key string< key of This-Entry. Key not found
                 Table))))))

;;; This function clears all entries in the specified hash table.


(defun Copy-Over-Buckets (Index Old-Size Old-Buckets Table)
  (cond ((>= Index Old-Size)
         Table)
        (t
         (let ((Bucket-List (aref Old-Buckets Index)))
           (Copy-Over-Bucket Bucket-List Table)
           (Copy-Over-Buckets
            (1+ Index) Old-Size Old-Buckets Table)))))

(defun Copy-Over-Bucket (Bucket-List Table)
  (cond ((null Bucket-List) Table)
        (t
         (let ((This-Entry (car Bucket-List)))
           (Hash-Insert Table
                        (Entry-Key This-Entry)
                        (Entry-Value This-Entry))
           (Copy-Over-Bucket (cdr Bucket-List) Table)))))

;; This function creates a hash table having the specified number of
;; buckets. Since the size of a hash table must be a prime
;; number, the specified number of buckets is rounded up to a
;; nearby prime. The new table is then initialized.

(defun Make-Hash-Table (&optional Num-Buckets)
  (let ((Size (Determine-Hash-Table-Size
               (or Num-Buckets MIN_HASH_TABLE_SIZE))))
    (Make-HashTable :Num-Buckets Size
                    :Buckets (Make-Hash-Buckets Size)
                    :Number-Entries 0)))

;; This function creates and initializes a bucket array.

(defun Make-Hash-Buckets (Size)
  (make-array Size))

;;; This function looks up a key in the hash table. If it is
;;; found, the entry's value is returned. Otherwise, nil is
;;; returned.

(defun Hash-Lookup (Table Key)
  (let* ((Index (Hash-Function
                 Key (HashTable-Num-Buckets Table)))
         (Bucket-List (aref (HashTable-Buckets Table)
                            Index)))
    (loop
      (cond ((or (null Bucket-List)
                 (string< Key
                          (Entry-Key (car Bucket-List))))
             (return nil))
            ((string= Key
                      (Entry-Key (car Bucket-List)))
             (return (Entry-Value (car Bucket-List))))
            (t
             (setq Bucket-List (cdr Bucket-List)))))))


(defun Clear-Hash-Table (Table)
  (let ((Size (HashTable-Num-Buckets Table)))
    (setf (HashTable-Num-Buckets Table) Size)
    (setf (HashTable-Number-Entries Table) 0)
    (setf (HashTable-Buckets Table) (Make-Hash-Buckets Size))))

;;; This function picks the first prime number greater than or equal to
;;; the specified size estimate. The minimum hash table size is enforced
;;; here.

(defun Determine-Hash-Table-Size (Size-Estimate &aux Size)
  (if (< Size-Estimate MIN_HASH_TABLE_SIZE)
      (setq Size MIN_HASH_TABLE_SIZE)
      (setq Size Size-Estimate))
  (if (= (mod Size 2) 0)
      (setq Size (1+ Size)))
  (loop
    (if (null (Prime-Number-Test Size))
        (setq Size (+ Size 2))
        (return)))
  Size)

(defun Prime-Number-Test (Number)
  (let ((Index 3))
    (cond ((= Number 2) t)
          ((= (mod Number 2) 0) nil)
          (t
           (loop
             (cond ((<= (Square Index) Number)
                    (if (= (mod Number Index) 0)
                        (return nil))
                    (setq Index (+ Index 2)))
                   (t (return t))))))))

(defun Square (n)
  (* n n))

;;; This function calculates a hash table index from a key
;;; (symbol->string) and the hash table size.

(defun Hash-Function (Key Size)
  (let* ((Sum 0)
         (Key-String (string Key))
         (Length (1- (string-length Key-String))))
    (loop
      (cond ((< Length 0) (return))
            (t
             (setq Sum
                   (+ Sum (char-int (aref Key-String Length))))
             (setq Length (1- Length)))))
    (mod Sum Size)))
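
;;; Illustrative sketch, not part of the original listing. Hash-Function
;;; above sums the character codes of the key's name and reduces the sum
;;; modulo the table size. The name Sketch-Additive-Hash is an
;;; assumption, and the standard function length is used in place of
;;; string-length, which is not defined in every Common Lisp.

(defun Sketch-Additive-Hash (Key Size)
  (let ((Key-String (string Key)))
    ;; Add up the character codes, then fold the sum into [0, Size).
    (mod (loop for Index from 0 below (length Key-String)
               sum (char-int (aref Key-String Index)))
         Size)))

;;; Assuming ASCII character codes, "AB" sums to 65 + 66 = 131, and
;;; 131 mod 11 = 10. Because addition is commutative, keys that are
;;; anagrams of each other (for example "AB" and "BA") always fall in
;;; the same bucket, where Hash-Insert keeps them in increasing key
;;; order.
;;; (Sketch-Additive-Hash "AB" 11)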


;;; This function deletes an entry in the hash table.


;;; -*- Syntax: Common-Lisp; Mode: LISP; Base: 10; Package: USER -*-


(Arguments nil))


;;================================================================

;; Pi Simulator -- functional version


;; Global variables

(defconstant *Machine-Dimensions* '(4 4 4)
  "this is the machine dimensions")


(defstruct Instruction
  (Op nil)
  (Args nil))

;; Nodes

;; This translates a node ID to a node.


(defvar *Event-Queue* nil
  "this is the global event queue")

(defvar *Nodes* nil
  "this is the node array")

(defvar *Global-Bindings* (Make-Hash-Table)
  "these are the bindings for nodals, constants, etc.")

(defvar *Nodal-Count* 0
  "This is the number of defined nodals")

(defvar *Debug-Level* 0
  "this is the debugging level")

(defvar *Log* nil
  "this is the logging information")

(defvar *Global-Plist* nil
  "The global property list.")

;; Structures

(defstruct Node
  (Time 0)
  (ID 0)
  (Segments (Make-Hash-Table))
  (Nodals nil))

(defstruct Segment
  (Type nil)
  (Data nil)
  (Size 0))

(defstruct Task
  (Handler nil)
  (Node nil)
  (Segment nil)
  (IP 0)
  (Status 'New))


(defun Translate-Node (Node-ID)
  (aref *Nodes* Node-ID))

;; This function returns the number of nodes.

(defun Number-Of-Nodes ()
  (array-total-size *Nodes*))

(defun Copy-Replace-Node (New-Node ID Nodes)
  (Copy-Replace-Elt New-Node ID Nodes))

;; This function creates the node array according to the dimension
;; constant.

(defun Make-Nodes ()
  (let* ((Number-Of-Nodes (apply #'* *Machine-Dimensions*))
         (Nodes (make-array Number-Of-Nodes))
         (ID 0)
         (Node nil)
         (Nodals-Segment NIL))
    (Make-Nodes-1 Number-Of-Nodes Nodes ID Node Nodals-Segment)))

(defun Make-Nodes-1 (Number-Of-Nodes Nodes ID Node Nodals-Segment)
  (cond ((not (< ID Number-Of-Nodes))
         (setq *Nodes* Nodes))
        (t
         (setq Node (Make-Node :ID ID))
         (setq Nodals-Segment (Create-Read-Write-Segment 100))
         (setq Nodes (Copy-Replace-Node Node ID Nodes))
         (multiple-value-bind (Sgmt-ID Intermediate-Node)
             (Add-Segment Nodals-Segment Node)
           (setq Node
                 (Make-Node :Time (Node-Time Intermediate-Node)
                            :ID (Node-ID Intermediate-Node)
                            :Segments (Node-Segments Intermediate-Node)
                            :Nodals Sgmt-ID))
           (setq Nodes (Copy-Replace-Node Node (Node-ID Node) Nodes)))
         (Make-Nodes-1 Number-Of-Nodes Nodes (+ ID 1) Node
                       Nodals-Segment))))

;; This function resets the node time and clears the node segment.


(defstruct Message
  (Destination nil)
  (Length 0)
  (Type nil)
  (Arguments nil))

(defstruct Event
  (Time 0)
  (Object nil))

(defstruct Handler
  (Name nil)
  (Instructions nil)
  (Arity 0)
  (Number-Of-Locals 0)
  (Bindings (Make-Hash-Table)))

(defstruct D-Sync
  (Suspended-Tasks nil))

(defstruct B-Sync
  (Count 0)
  (Suspended-Tasks nil))

(defstruct Log
  (Type 'All)
  (Task-Status-Profile (Make-Hash-Table))
  (Task-Type-Profile (Make-Hash-Table))
  (Instruction-Type-Profile (Make-Hash-Table))
  (Operation-Type-Profile (Make-Hash-Table))
  (Concurrency-List nil)
  (Old-Logs nil))

(defstruct Delta
  (Time 0)
  (Value 0))

(dafatruct  Taak-flagnant 
(Storaga-R^ta  0) 

(Typa  nil) 


(defun Clear-Nodes ()
  (let ((Node nil)
        (Nodes-Index 0)
        (Nodals-Id nil)
        (Nodals nil)
        (End-Index (array-total-size *Nodes*)))
    (Clear-Nodes-1 Node Nodes-Index Nodals-Id Nodals End-Index)))

(defun Clear-Nodes-1 (Node Nodes-Index Nodals-Id Nodals End-Index)
  (cond ((not (< Nodes-Index End-Index))
         nil)
        (t
         (setq Node (aref *Nodes* Nodes-Index))
         (setq Nodals-Id (Node-Nodals Node))
         (setq Nodals (Translate-Segment-On-Node Nodals-Id Node))
         (setq Node (Make-Node :Time 0 ;; (setf (Node-Time Node) 0)
                               :ID (Node-ID Node)
                               :Segments (Node-Segments Node)
                               :Nodals (Node-Nodals Node)))
         (setq *Nodes* (Copy-Replace-Node Node (Node-ID Node) *Nodes*))
         (setq Node
               (Make-Node :Time (Node-Time Node)
                          :ID (Node-ID Node)
                          :Segments (Clear-Hash-Table (Node-Segments Node))
                          :Nodals (Node-Nodals Node)))
         (setq *Nodes* (Copy-Replace-Node Node (Node-ID Node) *Nodes*))
         (setq Node (Make-Node :Time (Node-Time Node)
                               :ID (Node-ID Node)
                               :Segments (Hash-Insert (Node-Segments Node)
                                                      Nodals-Id
                                                      Nodals)
                               :Nodals (Node-Nodals Node)))
         (setq *Nodes* (Copy-Replace-Node Node (Node-ID Node) *Nodes*))
         (let* ((Data (Segment-Data Nodals))
                (Index 0)
                (Data-Size (array-total-size Data)))
           (Clear-Nodes-2 Data Index Data-Size))
         (setq Nodes-Index (1+ Nodes-Index))
         (Clear-Nodes-1 Node Nodes-Index Nodals-Id Nodals End-Index))))
(defun Clear-Nodes-2 (Data Index Data-Size)
  (cond ((not (< Index Data-Size))
         nil)
        (t
         (setq Data (Copy-Replace-Elt 'UNBOUND Index Data))
         (setq Index (1+ Index))
         (Clear-Nodes-2 Data Index Data-Size))))


(values New-Value
        (Make-Segment :Size (Segment-Size Segment)
                      :Type (Segment-Type Segment)
                      :Data (Copy-Replace-Elt New-Value
                                              Offset
                                              (Segment-Data Segment)))))


;; This function attempts to match a key in an associative set or cache


;;================================================================

;; Segments

;; This adds a segment to the node's segment translations. It
;; returns the unique segment ID.

(defun Add-Segment (Segment Node)
  (let* ((Segment-ID (gensym "Segment-"))
         (New-Segments
           (Hash-Insert (Node-Segments Node)
                        Segment-ID
                        Segment))
         (New-Node
           (Make-Node :Time (Node-Time Node)
                      :ID (Node-ID Node)
                      :Segments New-Segments
                      :Nodals (Node-Nodals Node))))
    (values Segment-ID New-Node)))

;; This removes a segment ID from the node's segment
;; translations.

(defun Delete-Segment (Segment-ID Node)
  (let* ((New-Segments
           (Hash-Delete (Node-Segments Node)
                        Segment-ID))
         (New-Node (Make-Node :Time (Node-Time Node)
                              :ID (Node-ID Node)
                              :Segments New-Segments
                              :Nodals (Node-Nodals Node))))
    New-Node))

;; This translates a segment ID to a segment on the specified
;; task's node.

(defun Translate-Segment (Segment-ID Task)
  (Translate-Segment-On-Node Segment-ID
                             (Task-Node Task)))

;; This translates a segment ID on a specified node.


(defun Match-Segment (Segment Key)
  (case (Segment-Type Segment)
    (Associative-Set
     (Hash-Lookup (Segment-Data Segment) Key))
    (Cache
     (Match-Cache Key Segment))
    (otherwise
     (break "PiSim error: incorrect access operation for segment type"))))

;; This function inserts a key in an associative set or cache segment.

(defun Insert-Segment (Segment Key New-Value)
  (case (Segment-Type Segment)
    (Associative-Set
     (values
      (Make-Segment :Type (Segment-Type Segment)
                    :Data (Hash-Insert (Segment-Data Segment)
                                       Key
                                       New-Value)
                    :Size (Segment-Size Segment))
      New-Value))
    (Cache
     (Insert-Cache Key Segment New-Value))
    (otherwise
     (break "PiSim error: incorrect access operation for segment type"))))

;; This function removes a key from an associative set or cache segment.

(defun Remove-Key-Segment (Segment Key)
  (case (Segment-Type Segment)
    (Associative-Set
     (Make-Segment :Type (Segment-Type Segment)
                   :Data (Hash-Delete (Segment-Data Segment) Key)
                   :Size (Segment-Size Segment)))
    (Cache (Remove-Key-Cache Key Segment))
    (otherwise
     (break "PiSim error: incorrect access operation for segment type"))))

;; This function clears an associative set or cache segment.


(defun Translate-Segment-On-Node (Segment-ID Node)
  (let ((Segment (Hash-Lookup (Node-Segments Node)
                              Segment-ID)))
    (if (null Segment)
        (break "PiSim error: missing segment")
        Segment)))

;; This function creates a read-write segment.

(defun Create-Read-Write-Segment (Size)
  (Make-Segment :Size Size
                :Type 'Read-Write
                :Data (make-array Size)))


(defun Clear-Segment (Segment)
  (case (Segment-Type Segment)
    (Associative-Set
     (Make-Segment :Type (Segment-Type Segment)
                   :Data (Clear-Hash-Table (Segment-Data Segment))
                   :Size (Segment-Size Segment)))
    (Cache
     (Clear-Cache Segment))
    (otherwise
     (break "PiSim error: incorrect access operation for segment type"))))


;;================================================================

;; Caches


;; This function creates an associative set segment.

(defun Create-Associative-Set-Segment (Size)
  (Make-Segment :Size Size
                :Type 'Associative-Set
                :Data (Make-Hash-Table Size)))

;; This function creates a cache segment.

(defun Create-Cache-Segment (Size)
  (Make-Segment :Size Size
                :Type 'Cache
                :Data (make-array Size)))

;; This function reads a read-write segment.


;; In PiSim, caches are implemented as direct mapped arrays. A hash
;; function computes an index into an array. Array entries are cons
;; cells of the format: (Key . Value).

;; This is the hash function for caches.

(defun Cache-Hash (Key Size)
  (when (numberp Key)
    (setq Key (format nil "~a" Key)))
  (let* ((String (string Key))
         (Character nil)
         (Value 0)
         (Index 0)
         (End-Index (array-total-size String)))
    (Cache-Hash-1 String Character Value Size Index End-Index)))


(defun Read-Segment (Segment Offset)
  (unless (equal (Segment-Type Segment)
                 'Read-Write)
    (break
     "PiSim error: incorrect access operation for segment type"))
  (aref (Segment-Data Segment) Offset))

;; This function writes a read-write segment.


(defun Cache-Hash-1 (String Character Value Size Index End-Index)
  (cond ((not (< Index End-Index))
         (mod Value Size))
        (t
         (setq Character (aref String Index))
         (setq Value (+ (char-int Character) Value))
         (setq Index (1+ Index))
         (Cache-Hash-1 String Character Value Size Index End-Index))))


(defun Write-Segment (Segment Offset New-Value)
  (unless (equal (Segment-Type Segment)
                 'Read-Write)
    (break
     "PiSim error: incorrect access operation for segment type"))


;; This function attempts to match a key in a hash table. If the key
;; is found, the corresponding value is returned, otherwise 'Miss is
;; returned.

(defun Match-Cache (Key Segment)
  (let* ((Index (Cache-Hash Key (Segment-Size Segment)))
         (Entry (aref (Segment-Data Segment) Index)))
    (if (and (not (equal Entry 'Empty))
             (equal (first Entry) Key))
        (rest Entry)
        'Miss)))

;; This function writes an entry in the cache, possibly
;; overwriting another value.

(defun Insert-Cache (Key Segment New-Value)
  (let* ((Value (cons Key New-Value))
         (New-Segment-Data
          (Copy-Replace-Elt Value
                            (Cache-Hash Key
                                        (Segment-Size Segment))
                            (Segment-Data Segment))))
    (values (Make-Segment :Type (Segment-Type Segment)
                          :Data New-Segment-Data
                          :Size (Segment-Size Segment))
            Value)))

;; This function removes a key from a cache. If the key is not
;; present, no action is taken.

(defun Remove-Key-Cache (Key Segment)
  (let* ((Index (Cache-Hash Key (Segment-Size Segment)))
         (Entry (aref (Segment-Data Segment) Index)))
    (if (and (not (equal Entry 'Empty))
             (equal (first Entry) Key))
        (values
         (Make-Segment :Type (Segment-Type Segment)
                       :Data (Copy-Replace-Elt 'Empty
                                               Index
                                               (Segment-Data
                                                Segment))
                       :Size (Segment-Size Segment))
         'Empty)
        (values Segment nil))))

;; This function clears a cache.

(defun Clear-Cache (Segment)
  (let* ((Data (Segment-Data Segment))
         (Index 0)
         (End-Index (array-total-size Data)))
    (Clear-Cache-1 Data Index End-Index Segment)))

(defun Clear-Cache-1 (Data Index End-Index Segment)
  (cond ((not (< Index End-Index))
         Segment)
        (t
         (setq Data (Copy-Replace-Elt 'EMPTY Index Data))
         (setq Segment (Make-Segment :Type (Segment-Type Segment)
                                     :Data Data
                                     :Size (Segment-Size Segment)))
         (setq Index (1+ Index))
         (Clear-Cache-1 Data Index End-Index Segment))))
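
The direct-mapped cache scheme described in the comments above can be tried in isolation. The following is a minimal sketch, not part of PiSim: the names sketch-hash, sketch-insert and sketch-match are hypothetical, a plain vector stands in for the segment data, and copy-seq stands in for Copy-Replace-Elt, whose definition is not shown in this listing.

```lisp
;; Hypothetical stand-alone sketch of a direct-mapped cache.
;; None of these names belong to PiSim.

(defun sketch-hash (key size)
  "Sum the character codes of KEY's printed form, modulo SIZE."
  (let ((text (if (numberp key) (format nil "~a" key) (string key))))
    (mod (reduce #'+ text :key #'char-int :initial-value 0) size)))

(defun sketch-insert (key value cache)
  "Return a fresh copy of CACHE with (KEY . VALUE) stored in KEY's slot."
  (let ((copy (copy-seq cache)))
    (setf (aref copy (sketch-hash key (length copy))) (cons key value))
    copy))

(defun sketch-match (key cache)
  "Return the value cached for KEY, or the symbol MISS."
  (let ((entry (aref cache (sketch-hash key (length cache)))))
    (if (and (consp entry) (equal (car entry) key))
        (cdr entry)
        'miss)))
```

Because the hash only sums character codes, the keys "ab" and "ba" map to the same slot, so inserting one evicts the other. This is the "possibly overwriting another value" behaviour noted for Insert-Cache.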


;; Tasks

;;; This returns the node ID of the specified task's node.

(defun Node-Of (Task)
  (Node-ID (Task-Node Task)))

;;; This returns the time of a task. This is defined as the node
;;; time for the specified task.

(defun Time-Of (Task)
  (Node-Time (Task-Node Task)))


(defun Increment-Time-Of (Task Delta)
  (let* ((Task-Node (Task-Node Task))
         (New-Time (+ (Node-Time Task-Node) Delta)))
    (setq Task-Node (Make-Node :Time New-Time
                               :ID (Node-ID Task-Node)
                               :Segments (Node-Segments Task-Node)
                               :Nodals (Node-Nodals Task-Node)))
    (values New-Time
            Task-Node
            (Make-Task :Handler (Task-Handler Task)
                       :Node Task-Node
                       :Segment (Task-Segment Task)
                       :IP (Task-IP Task)
                       :Status (Task-Status Task)))))

;; This returns the handler type of the task.

(defun Handler-Name-Of (Task)
  (Handler-Name (Task-Handler Task)))

(defun Write-Arguments (Arguments Index New-Segment)
  (cond ((null Arguments)
         New-Segment)
        (t
         (multiple-value-bind (New-Value Written-Segment)
             (Write-Segment New-Segment Index (car Arguments))
           (Write-Arguments (cdr Arguments)
                            (1+ Index)
                            Written-Segment)))))

;;; This function creates a new task segment of the specified length.
;;; The number of arguments and message length values are compared with
;;; the handler arity and arity plus number of locals respectively. Two
;;; is added to the arity and number of locals to account for the message
;;; length and type information stored in the segment. The segment is
;;; then initialized with the supplied arguments.

(defun Create-Task-Segment (Length Task-Type Arguments Handler)
  (let ((New-Segment (Create-Read-Write-Segment Length)))
    (when (not (= (Handler-Arity Handler)
                  (length Arguments)))
      (break "PiSim error: arity mismatch"))
    (when (not (= Length (+ (Handler-Arity Handler)
                            (Handler-Number-Of-Locals Handler)
                            2)))
      (break "PiSim error: length/handler storage mismatch"))
    (Make-Task-Segment
     :Storage-Rqmnts Length
     :Type Task-Type
     :Arguments (Write-Arguments Arguments 2 New-Segment))))

;;; This function creates a new task for a message. The handler and
;;; node are determined. A new segment is created and initialized.
;;; After the new task is created, its segment is added to the task's
;;; node. Finally the new task is returned.

(defun Create-Task (Message)
  (let* ((Handler (Get-Handler (Message-Type Message)))
         (Node (Translate-Node (Message-Destination Message))))
    (Make-Task :Handler Handler
               :Node Node)))

;;; This function executes a task. It executes instructions which
;;; change a task's status. If the status is 'Running, another
;;; instruction is executed.

(defun Execute-Task (Task)
  (multiple-value-bind (Value New-Task)
      (Execute-Next-Instruction Task)
    (setq Task New-Task))
  (if (equal (Task-Status Task) 'Running)
      (Execute-Task Task)))


;;; This sets the time of the specified task (i.e. the time of
;;; the node of the specified task).

(defun Set-Time-Of (Task New-Time)
  (let ((Task-Node (Task-Node Task)))
    (setq Task-Node (Make-Node :Time New-Time
                               :ID (Node-ID Task-Node)
                               :Segments (Node-Segments Task-Node)
                               :Nodals (Node-Nodals Task-Node)))
    (values New-Time
            Task-Node
            (Make-Task :Handler (Task-Handler Task)
                       :Node Task-Node
                       :Segment (Task-Segment Task)
                       :IP (Task-IP Task)
                       :Status (Task-Status Task)))))


;; Events


;; This function enqueues an event in the global event queue.
;; Events are enqueued in order of increasing event time.
;; ** Note that when 2 events have the same time, the one sent to
;; Enqueue-Event first has higher priority.

(defun Enqueue-Event (New-Event)
  (if (or (null *Event-Queue*)
          (< (Event-Time New-Event)
             (Event-Time (first *Event-Queue*))))
      (setq *Event-Queue*
            (cons New-Event *Event-Queue*))
      (setq *Event-Queue*
            (Insert-Event New-Event *Event-Queue*))))


;;; Increment-Time-Of: this increments the task time by the specified delta.


;; This function is used to enqueue events inside the event queue.
;; It is part of a recursive, priority queue insert algorithm.

(defun Insert-Event (New-Event Event-Queue)
  (if (or (null (rest Event-Queue))
          (< (Event-Time New-Event)
             (Event-Time (second Event-Queue))))
      (cons (car Event-Queue)
            (cons New-Event (rest Event-Queue)))
      (cons (car Event-Queue)
            (Insert-Event New-Event (rest Event-Queue)))))
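
The ordering rule used by Enqueue-Event and Insert-Event (ascending event time, with earlier arrivals ahead of later ones at the same time) can be sketched as one recursive function. This is an illustrative sketch only: sketch-event and sketch-enqueue are hypothetical names, the queue is passed and returned explicitly rather than kept in *Event-Queue*, and the two PiSim functions are merged into one.

```lisp
;; Hypothetical stand-alone sketch of a time-ordered event queue.
;; None of these names belong to PiSim.

(defstruct sketch-event time label)

(defun sketch-enqueue (event queue)
  "Return QUEUE with EVENT placed after every event whose time is <= its own."
  (cond ((or (null queue)
             (< (sketch-event-time event)
                (sketch-event-time (first queue))))
         ;; EVENT is strictly earlier than the head, so it goes first.
         (cons event queue))
        (t
         ;; Otherwise keep the head and insert into the tail.
         (cons (first queue)
               (sketch-enqueue event (rest queue))))))

(defun sketch-enqueue-all (events)
  "Enqueue EVENTS one by one, in list order, starting from an empty queue."
  (reduce (lambda (queue event) (sketch-enqueue event queue))
          events
          :initial-value nil))
```

The strict `<` comparison is what gives first-come priority: an event is only placed ahead of events with a later time, never ahead of one with an equal time.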

;; This function dequeues and returns an event from the global
;; event queue. If the queue is empty, nil is returned.

(defun Dequeue-Event ()
  (let ((Event (car *Event-Queue*)))
    (setq *Event-Queue* (cdr *Event-Queue*))
    Event))

;; This function clears the event queue.

(defun Clear-Event-Queue ()
  (setq *Event-Queue* nil))

;; This function dequeues and executes the next event in the
;; event queue. If the event is a message, a new task is
;; created. The node time is adjusted if the event time is
;; later than node time. If an event is executed, t is returned.

(defun Execute-Next-Event ()
  (let* ((Event (Dequeue-Event))
         Task)
    (setq Task (Create-Task (Event-Object Event)))
    (multiple-value-bind (New-Time Task-Node New-Task)
        (Set-Time-Of Task
                     (if (> (Event-Time Event)
                            (Time-Of Task))
                         (Event-Time Event)
                         (Time-Of Task)))
      (setq *Nodes*
            (Copy-Replace-Node
             Task-Node
             (Translate-Node
              (Message-Destination (Event-Object Event)))
             *Nodes*))
      (setq Task New-Task))
    (let* ((Message (Event-Object Event))
           (Node (Translate-Node (Message-Destination Message)))
           (New-Segment (Create-Task-Segment
                         (Message-Length Message)
                         (Message-Type Message)
                         (Message-Arguments Message)
                         (Task-Handler Task))))
      (multiple-value-bind (New-Segment-ID New-Node)
          (Add-Segment New-Segment Node)
        (setq Node New-Node)
        (setq *Nodes* (Copy-Replace-Node
                       Node
                       (Message-Destination Message)
                       *Nodes*))
        (setq Task (Make-Task :Handler (Task-Handler Task)
                              :Node Node
                              :Segment New-Segment-ID
                              :IP (Task-IP Task)
                              :Status (Task-Status Task)))))
    (Debug-Print 1
                 "(start: task ~a node ~d time ~d old status ~a)~&"
                 (Handler-Name-Of Task) (Node-Of Task)
                 (Time-Of Task) (Task-Status Task))
    (Log-Task Task)
    (setq Task
          (Make-Task :Handler (Task-Handler Task)
                     :Node (Task-Node Task)
                     :Segment (Task-Segment Task)
                     :IP (Task-IP Task)
                     :Status 'Running))
    (Adjust-Concurrency-List (Time-Of Task) 1)
    (Execute-Task Task)
    (Adjust-Concurrency-List (Time-Of Task) -1)
    (Debug-Print 1 "(stop: task ~a node ~d time ~d status ~a)~&"
                 (Handler-Name-Of Task) (Node-Of Task)
                 (Time-Of Task) (Task-Status Task))))


;; Handlers

;; This predicate tests if a statement is a label.

(defun Label? (Statement)
  (symbolp Statement))

;; This predicate tests if a statement is an instruction.

(defun Instruction? (Statement)
  (listp Statement))


;; This function inserts a binding into a handler's bindings. If the
;; specified handler is 'Global, the binding is inserted in the global
;; bindings.

(defun Insert-Binding (Name Value Handler)
  (cond ((equal Handler 'Global)
         (setq *Global-Bindings*
               (Hash-Insert *Global-Bindings* Name Value))
         (values Value Handler))
        (t
         (setq Handler
               (Make-Handler :Name (Handler-Name Handler)
                             :Instructions (Handler-Instructions Handler)
                             :Arity (Handler-Arity Handler)
                             :Number-of-Locals
                             (Handler-Number-of-Locals Handler)
                             :Bindings
                             (Hash-Insert (Handler-Bindings Handler)
                                          Name
                                          Value)))
         (values Value Handler))))

;; This function looks up the binding of a symbol in the handler. If
;; it is not found there, the global bindings are checked.

(defun Lookup-Binding (Name Handler)
  (or (Hash-Lookup (Handler-Bindings Handler) Name)
      (Hash-Lookup *Global-Bindings* Name)))

;; This function returns the number of instructions in a handler.

(defun Number-Of-Instructions (Handler)
  (array-total-size (Handler-Instructions Handler)))

;; This function returns the handler object for the handler name. If
;; the handler does not exist, an error message is printed.

(defun Get-Handler (Name)
  (let ((Handler (get Name 'Handler)))
    (if (null Handler)
        (break "PiSim error: unknown handler")
        Handler)))

;; This function determines the number of instructions in a sequence
;; of statements and builds an instruction array of the correct size.
;; It then reads each statement. If it is an instruction, it is
;; inserted into the array. If it is a label, the label and
;; statement index are inserted into the handler's bindings.

(defun Make-Instructions (Statements Handler)
  (let (Instructions)
    (let ((Temp-Stmts Statements)
          (Statement nil)
          (Number-Of-Statements 0))
      (setq Instructions
            (Make-Instructions-1 Instructions Temp-Stmts Statement
                                 Number-Of-Statements)))
    (let ((Index 0)
          (Statement nil)
          (Temp-Stmts Statements))
      (multiple-value-bind (Instructions New-Handler)
          (Make-Instructions-2 Instructions Temp-Stmts Statement
                               Index Handler)
        (setq Handler New-Handler))
      (setq Handler
            (Make-Handler :Name (Handler-Name Handler)
                          :Instructions Instructions
                          :Arity (Handler-Arity Handler)
                          :Number-of-Locals (Handler-Number-of-Locals
                                             Handler)
                          :Bindings (Handler-Bindings Handler)))
      (values Instructions Handler))))

(defun Make-Instructions-1 (Instructions Temp-Stmts Statement
                            Number-Of-Statements)
  (cond ((null Temp-Stmts)
         (setq Instructions (make-array Number-Of-Statements)))
        (t
         (setq Statement (car Temp-Stmts))
         (setq Temp-Stmts (cdr Temp-Stmts))
         (cond ((not (Label? Statement))
                (if Statement
                    (setq Number-Of-Statements
                          (1+ Number-Of-Statements)))))
         (Make-Instructions-1 Instructions Temp-Stmts Statement
                              Number-Of-Statements))))

(defun Make-Instructions-2 (Instructions Temp-Stmts Statement Index Handler)
  (cond ((null Temp-Stmts)
         (values Instructions Handler))
        (t (setq Statement (car Temp-Stmts))
           (setq Temp-Stmts (cdr Temp-Stmts))
           (cond ((Label? Statement)
                  (multiple-value-bind (Value New-Handler)
                      (Insert-Binding Statement Index Handler)
                    (setq Handler New-Handler)))
                 ((Instruction? Statement)
                  (progn
                    (setq Instructions
                          (Copy-Replace-Elt
                           Statement Index Instructions))
                    (setq Index (1+ Index)))))
           (Make-Instructions-2
            Instructions Temp-Stmts Statement Index Handler))))


;; This function indexes the parameters and locals in a handler.
;; This includes assigning each parameter and value an index
;; in the handler segment. These assignments are included in
;; the handler's bindings. The arity and number of locals
;; parameters are also set.

(defun Index-Parameters-And-Locals (Parameters Locals Handler)
  (let ((Parameter nil)
        (Temp-Parameters Parameters)
        (Index 2))
    (setq Handler
          (Index-Parameters-And-Locals-1
           Parameter Temp-Parameters
           Index Handler)))
  (let ((Local nil)
        (Temp-Locals Locals)
        (Index (+ (length Parameters) 2)))
    (setq Handler
          (Index-Parameters-And-Locals-2 Local Temp-Locals Index
                                         Handler)))
  (setq Handler (Make-Handler :Name (Handler-Name Handler)
                              :Instructions (Handler-Instructions
                                             Handler)
                              :Arity (length Parameters)
                              :Number-of-Locals
                              (Handler-Number-of-Locals Handler)
                              :Bindings (Handler-Bindings
                                         Handler)))
  (setq Handler (Make-Handler :Name (Handler-Name Handler)
                              :Instructions (Handler-Instructions
                                             Handler)
                              :Arity (Handler-Arity Handler)
                              :Number-of-Locals (length Locals)
                              :Bindings (Handler-Bindings
                                         Handler)))
  Handler)

(defun Index-Parameters-And-Locals-1 (Parameter Temp-Parameters
                                      Index Handler)
  (cond ((null Temp-Parameters) Handler)
        (t
         (setq Parameter (car Temp-Parameters))
         (setq Temp-Parameters (cdr Temp-Parameters))
         (multiple-value-bind (Value New-Handler)
             (Insert-Binding Parameter Index Handler)
           (setq Handler New-Handler))
         (setq Index (1+ Index))
         (Index-Parameters-And-Locals-1 Parameter Temp-Parameters
                                        Index Handler))))

(defun Index-Parameters-And-Locals-2 (Local Temp-Locals
                                      Index Handler)
  (cond ((null Temp-Locals) Handler)
        (t
         (setq Local (car Temp-Locals))
         (setq Temp-Locals (cdr Temp-Locals))
         (multiple-value-bind (Value New-Handler)
             (Insert-Binding Local Index Handler)
           (setq Handler New-Handler))
         (setq Index (1+ Index))
         (Index-Parameters-And-Locals-2 Local Temp-Locals Index
                                        Handler))))

;; This function reads a handler from an expression. The
;; resultant handler is stored on the property list of the
;; handler name.

(defun Read-Handler (Expression)
  (let ((Name (first Expression))
        (Parameters (second Expression))
        (Locals (third Expression))
        (Statements (nthcdr 3 Expression))
        (New-Handler (Make-Handler)))
    (setq New-Handler
          (Make-Handler :Name Name
                        :Instructions (Handler-Instructions
                                       New-Handler)
                        :Arity (Handler-Arity New-Handler)
                        :Number-of-Locals
                        (Handler-Number-of-Locals New-Handler)
                        :Bindings (Handler-Bindings New-Handler)))
    (setq New-Handler
          (Index-Parameters-And-Locals Parameters Locals New-Handler))
    (multiple-value-bind (Instructions Newer-Handler)
        (Make-Instructions Statements New-Handler)
      (setq New-Handler Newer-Handler))
    (setq *Global-Plist*
          (Update-Plist Name 'Handler New-Handler))))

;; This allows the definition of handlers. This should be part
;; of a more general reader.

(defun Define-Handler (&rest Expression)
  (Debug-Print 0 "~&loading handler ~a~&" (first Expression))
  (Read-Handler Expression)
  nil)


;; Nodals

;; This allows the definition of nodals (node variables). An
;; index is assigned (using the number of existing nodals). A new
;; global binding is added.

(defun Define-Nodal (Name)
  (Debug-Print 0 "~&defining nodal ~a~&" Name)
  (cond ((not (null (Hash-Lookup *Global-Bindings* Name)))
         (format t "~&Warning: ~a has already been defined globally~&"
                 Name))
        (t
         (multiple-value-bind (Value Handler)
             (Insert-Binding Name *Nodal-Count* 'Global))
         (setq *Nodal-Count* (1+ *Nodal-Count*)))))


;; Constants

;; This allows the definition of global constants. The binding
;; is added to the global bindings.

(defun Define-Constant (Name Value)
  (Debug-Print 0 "~&defining constant ~a~&" Name)
  (multiple-value-bind (Value Handler)
      (Insert-Binding Name Value 'Global)))

;;==============================================================
;; Instructions

;; This function returns the next instruction of the handler to be
;; executed. The current instruction pointer (IP) is obtained from
;; the task. The instructions are obtained from the handler. The
;; task instruction pointer is incremented. Note: the instruction
;; pointer is incremented AFTER the next instruction is fetched.

(defun Next-Instruction (Task)
  (let ((IP (Task-IP Task)))
    (when (>= IP
              (Number-Of-Instructions (Task-Handler Task)))
      (break "PiSim error: IP out of range"))
    (setq Task (Make-Task :Handler (Task-Handler Task)
                          :Node (Task-Node Task)
                          :Segment (Task-Segment Task)
                          :IP (1+ (Task-IP Task))
                          :Status (Task-Status Task)))
    (values (aref (Handler-Instructions (Task-Handler Task))
                  IP)
            Task)))

;; This function executes a single instruction. It first locates the
;; next instruction using the task instruction pointer. The
;; instruction pointer is incremented. Then it applies the operation
;; to the arguments.

(defun Execute-Next-Instruction (Active-Task)
  (multiple-value-bind (Instruction New-Task)
      (Next-Instruction Active-Task)
    (setq Active-Task New-Task)
    (Debug-Print 2 "(executing instruction ~a)~&"
                 (Instruction-Op Instruction))
    (Log-Instruction Instruction)
    (multiple-value-bind (Value New-Task)
        (Apply-Operation (Instruction-Op Instruction)
                         Active-Task
                         (Instruction-Args Instruction))
      (setq Active-Task New-Task)
      (values Value Active-Task))))


;; Operations

;; This function applies a processor operation to a list of arguments.
;; Each argument is evaluated before the operation is applied. The
;; apply only takes place if the task status is 'RUNNING.

(defun Apply-Operation (Operation Active-Task Arguments)
  (multiple-value-bind (Argument-List New-Nodes
                        New-Task New-Event-Queue)
      (Evaluate-Arguments Arguments Active-Task)
    (setq *Nodes* New-Nodes
          Active-Task New-Task
          *Event-Queue* New-Event-Queue)
    (cond ((equal (Task-Status Active-Task)
                  'RUNNING)
           (Log-Operation Operation)
           (multiple-value-bind (Result New-Nodes New-Task
                                 New-Event-Queue)
               (apply (Get-Operation Operation)
                      Argument-List
                      *Nodes*
                      Active-Task
                      *Event-Queue*)
             (setq Active-Task New-Task
                   *Nodes* New-Nodes
                   *Event-Queue* New-Event-Queue)
             (values Result Active-Task)))
          (t (values nil Active-Task)))))


(defun Evaluate-Arguments (Arguments Active-Task)
  (let ((Argument nil))
    (Evaluate-Arguments-1 Argument Arguments *Nodes*
                          Active-Task *Event-Queue*)))

(defun Evaluate-Arguments-1 (Argument Arguments Nodes
                             Active-Task Event-Queue)
  (cond ((null Arguments)
         (values nil Nodes Active-Task Event-Queue))
        (t
         (setq Argument (car Arguments))
         (setq Arguments (cdr Arguments))
         (multiple-value-bind (Value New-Nodes New-Task
                               New-Event-Queue)
             (Evaluate Active-Task Argument)
           (multiple-value-bind (Argument-List Newer-Nodes
                                 Newer-Task Newer-Event-Queue)
               (Evaluate-Arguments-1 Argument Arguments New-Nodes
                                     New-Task New-Event-Queue)
             (values (cons Value Argument-List)
                     Newer-Nodes Newer-Task Newer-Event-Queue))))))

;; This function evaluates the expression and returns the
;; results. This is an evaluator appropriate for the limited
;; expressions in Pi programs. Expressions are only evaluated
;; if the task status is 'RUNNING. The following expression
;; types are possible:
;;
;; A number or string returns the value of the number or string.
;;
;; A symbol is looked up in the handler bindings. If it is
;; present, the corresponding value is returned. Otherwise, the
;; symbol is returned.
;;
;; A nested expression (a list) in the form (symbol arg1 arg2 ...).
;; In this case, Apply-Operation is recursively called.

(defun Evaluate (Active-Task Expression)
  (when (equal (Task-Status Active-Task)
               'RUNNING)
    (values
     (typecase Expression
       ((or number string)
        Expression)
       (symbol
        (or (Lookup-Binding Expression (Task-Handler Active-Task))
            Expression))
       (list
        (multiple-value-bind (Value New-Task)
            (Apply-Operation (Instruction-Op Expression)
                             Active-Task
                             (Instruction-Args Expression))
          (setq Active-Task New-Task)
          Value))
       (otherwise
        (break "PiSim error: unknown expression")))
     Active-Task)))

;; This function returns the operation function for the operation
;; name. If the operation does not exist, an error message is
;; printed.

(defun Get-Operation (Name)
  (let ((Operation (get Name 'Operation)))
    (if (null Operation)
        (break "PiSim error: unknown operation")
        Operation)))

;; This is used to define processor operations.

(defmacro Define-Operation (Name &rest Rest)
  (setq *Global-Plist*
        `(Update-Plist ,Name 'Operation #'(lambda ,@Rest))))


;; Debugging

;; This prints debug messages depending on the debug level.

(defmacro Debug-Print (Level Format &rest Arguments)
  `(when (<= ,Level *Debug-Level*)
     (format t ,Format ,@Arguments)))

;; This function sets the debug level.

(defun Set-Debug-Level (New-Level)
  (setq *Debug-Level* New-Level))


;; Logging

;; This function starts a new log, saving the current log.

(defun Start-New-Log ()
  (setq *Log* (Make-Log :Type (Log-Type *Log*)
                        :Old-Logs *Log*)))

;; This is used in a counting profile. The category count is
;; incremented, or created, if non-existent.

(defun Collect-Profile (Category Profile)
  (cond ((Hash-Lookup Profile Category)
         (let ((New-Value (1+ (Hash-Lookup Profile Category))))
           (setq Profile
                 (Hash-Insert Profile Category New-Value))
           (values New-Value Profile)))
        (t
         (values 1 (Hash-Insert Profile Category 1)))))

;; This predicate tests if logging is enabled. If the log is nil,
;; or its type is 'None, logging is off.

(defun Logging? ()
  (not (or (null *Log*)
           (equal (Log-Type *Log*) 'None))))

;; This function logs the specified task. Presently, profiles of task
;; types and statuses are maintained.

(defun Log-Task (Task)
  (when (Logging?)
    (multiple-value-bind (New-Value New-Profile)
        (Collect-Profile (Task-Status Task)
                         (Log-Task-Status-Profile *Log*))
      (setq *Log*
            (Make-Log :Type (Log-Type *Log*)
                      :Task-Status-Profile New-Profile
                      :Task-Type-Profile (Log-Task-Type-Profile *Log*)
                      :Instruction-Type-Profile
                      (Log-Instruction-Type-Profile *Log*)
                      :Operation-Type-Profile
                      (Log-Operation-Type-Profile *Log*)
                      :Concurrency-List (Log-Concurrency-List *Log*)
                      :Old-Logs (Log-Old-Logs *Log*)))
      (when (equal (Task-Status Task) 'New)
        (multiple-value-bind (New-Value New-Profile)
            (Collect-Profile (Handler-Name-Of Task)
                             (Log-Task-Type-Profile *Log*))
          (setq *Log*
                (Make-Log :Type (Log-Type *Log*)
                          :Task-Status-Profile
                          (Log-Task-Status-Profile *Log*)
                          :Task-Type-Profile New-Profile
                          :Instruction-Type-Profile
                          (Log-Instruction-Type-Profile *Log*)
                          :Operation-Type-Profile
                          (Log-Operation-Type-Profile *Log*)
                          :Concurrency-List (Log-Concurrency-List *Log*)
                          :Old-Logs (Log-Old-Logs *Log*))))))))

;; This function collects statistics on instruction types.

(defun Log-Instruction (Instruction)
  (when (Logging?)
    (cond ((not (equal (first Instruction) 'Write))
           (multiple-value-bind (New-Value New-Profile)
               (Collect-Profile (first Instruction)
                                (Log-Instruction-Type-Profile *Log*))
             (setq *Log*
                   (Make-Log :Type (Log-Type *Log*)
                             :Task-Status-Profile
                             (Log-Task-Status-Profile *Log*)
                             :Task-Type-Profile (Log-Task-Type-Profile
                                                 *Log*)
                             :Instruction-Type-Profile New-Profile
                             :Operation-Type-Profile
                             (Log-Operation-Type-Profile *Log*)
                             :Concurrency-List
                             (Log-Concurrency-List *Log*)
                             :Old-Logs (Log-Old-Logs *Log*)))))
          ((not (listp (fourth Instruction)))
           (multiple-value-bind (New-Value New-Profile)
               (Collect-Profile 'Initialize
                                (Log-Instruction-Type-Profile
                                 *Log*))
             (setq *Log*
                   (Make-Log :Type (Log-Type *Log*)
                             :Task-Status-Profile
                             (Log-Task-Status-Profile *Log*)
                             :Task-Type-Profile
                             (Log-Task-Type-Profile *Log*)
                             :Instruction-Type-Profile New-Profile
                             :Operation-Type-Profile
                             (Log-Operation-Type-Profile *Log*)
                             :Concurrency-List
                             (Log-Concurrency-List *Log*)
                             :Old-Logs (Log-Old-Logs *Log*)))))
          ((equal (first (fourth Instruction)) 'Read)
           (multiple-value-bind (New-Value New-Profile)
               (Collect-Profile
                'Move (Log-Instruction-Type-Profile *Log*))
             (setq *Log*
                   (Make-Log :Type (Log-Type *Log*)
                             :Task-Status-Profile
                             (Log-Task-Status-Profile *Log*)
                             :Task-Type-Profile
                             (Log-Task-Type-Profile *Log*)
                             :Instruction-Type-Profile New-Profile
                             :Operation-Type-Profile
                             (Log-Operation-Type-Profile *Log*)
                             :Concurrency-List
                             (Log-Concurrency-List *Log*)
                             :Old-Logs (Log-Old-Logs *Log*)))))
          (t
           (multiple-value-bind (New-Value New-Profile)
               (Collect-Profile (first (fourth Instruction))
                                (Log-Instruction-Type-Profile
                                 *Log*))
             (setq *Log*
                   (Make-Log :Type (Log-Type *Log*)
                             :Task-Status-Profile
                             (Log-Task-Status-Profile *Log*)
                             :Task-Type-Profile
                             (Log-Task-Type-Profile *Log*)
                             :Instruction-Type-Profile New-Profile
                             :Operation-Type-Profile
                             (Log-Operation-Type-Profile *Log*)
                             :Concurrency-List
                             (Log-Concurrency-List *Log*)
                             :Old-Logs (Log-Old-Logs *Log*))))))))

;; This function creates an operation profile.

(defun Log-Operation (Operation)
  (when (Logging?)
    (multiple-value-bind (New-Value New-Profile)
        (Collect-Profile Operation
                         (Log-Operation-Type-Profile *Log*))
      (setq *Log*
            (Make-Log :Type (Log-Type *Log*)
                      :Task-Status-Profile
                      (Log-Task-Status-Profile *Log*)
                      :Task-Type-Profile
                      (Log-Task-Type-Profile *Log*)
                      :Instruction-Type-Profile New-Profile
                      :Operation-Type-Profile
                      (Log-Operation-Type-Profile *Log*)
                      :Concurrency-List
                      (Log-Concurrency-List *Log*)
                      :Old-Logs (Log-Old-Logs *Log*))))))

;; This function searches down a sorted list of deltas looking
;; for an entry at a specified time. If such an entry is found,
;; its value is adjusted by Change. If no such value is found,
;; a new delta is created and inserted at the correct position in
;; the list.

(defun Adjust-Concurrency-List (Time Change)
  (when (Logging?)
    (let ((Concurrency-List (Log-Concurrency-List *Log*)))
      (cond ((or (null Concurrency-List)
                 (< Time (Delta-Time (first Concurrency-List))))
             (let ((New-Delta (Make-Delta :Time Time
                                          :Value Change)))
               (setq *Log*
                     (Make-Log :Type (Log-Type *Log*)
                               :Task-Status-Profile
                               (Log-Task-Status-Profile *Log*)
                               :Task-Type-Profile
                               (Log-Task-Type-Profile *Log*)
                               :Instruction-Type-Profile
                               (Log-Instruction-Type-Profile *Log*)
                               :Operation-Type-Profile
                               (Log-Operation-Type-Profile *Log*)
                               :Concurrency-List
                               (cons New-Delta
                                     (Log-Concurrency-List *Log*))
                               :Old-Logs (Log-Old-Logs *Log*)))
               New-Delta))
            ((= Time (Delta-Time (first Concurrency-List)))
             (let* ((First-Delta (first Concurrency-List))
                    (New-Delta
                     (Make-Delta :Time (Delta-Time First-Delta)
                                 :Value (+ (Delta-Value First-Delta)
                                           Change))))
               (setq *Log*
                     (Make-Log :Type (Log-Type *Log*)
                               :Task-Status-Profile
                               (Log-Task-Status-Profile *Log*)
                               :Task-Type-Profile
                               (Log-Task-Type-Profile *Log*)
                               :Instruction-Type-Profile
                               (Log-Instruction-Type-Profile *Log*)
                               :Operation-Type-Profile
                               (Log-Operation-Type-Profile *Log*)
                               :Concurrency-List
                               (cons New-Delta
                                     (cdr (Log-Concurrency-List *Log*)))
                               :Old-Logs (Log-Old-Logs *Log*)))
               (Delta-Value New-Delta)))
            (t
             (setq *Log*
                   (Make-Log :Type (Log-Type *Log*)
                             :Task-Status-Profile
                             (Log-Task-Status-Profile *Log*)
                             :Task-Type-Profile
                             (Log-Task-Type-Profile *Log*)
                             :Instruction-Type-Profile
                             (Log-Instruction-Type-Profile *Log*)
                             :Operation-Type-Profile
                             (Log-Operation-Type-Profile *Log*)
                             :Concurrency-List
                             (Adjust-Rest-Of-Concurrency-List
                              Time Change Concurrency-List)
                             :Old-Logs (Log-Old-Logs *Log*))))))))

;; This is the recursive part of Adjust-Concurrency-List.

(defun Adjust-Rest-Of-Concurrency-List (Time Change Concurrency-List)
  (cond ((or (null (rest Concurrency-List))
             (< Time (Delta-Time (second Concurrency-List))))
         (cons (car Concurrency-List)
               (cons (Make-Delta :Time Time :Value Change)
                     (rest Concurrency-List))))
        ((= Time (Delta-Time (second Concurrency-List)))
         (cons (car Concurrency-List)
               (cons (Make-Delta :Time (Delta-Time
                                        (second Concurrency-List))
                                 :Value
                                 (+ (Delta-Value (second Concurrency-List))
                                    Change))
                     (cdr (rest Concurrency-List)))))
        (t
         (cons (car Concurrency-List)
               (Adjust-Rest-Of-Concurrency-List
                Time Change (rest Concurrency-List))))))

;; This function prints the information from the current log.

(defun Print-Log-Information ()
  (when (or (equal (Log-Type *Log*) 'All)
            (equal (Log-Type *Log*) 'Profile))
    (Print-Profile-Data))
  (when (or (equal (Log-Type *Log*) 'All)
            (equal (Log-Type *Log*) 'Plot))
    (Plot-Concurrency)))
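
The delta-list bookkeeping done by Adjust-Concurrency-List and Adjust-Rest-Of-Concurrency-List can be illustrated without the log structure. The sketch below is an assumption-laden simplification, not PiSim code: sketch-adjust and sketch-concurrency are hypothetical names, deltas are plain (time . value) conses rather than Delta structures, and the list is passed explicitly instead of being stored in *Log*.

```lisp
;; Hypothetical stand-alone sketch of a time-sorted delta list.
;; None of these names belong to PiSim.

(defun sketch-adjust (time change deltas)
  "Return DELTAS, a time-sorted list of (TIME . VALUE) pairs, with CHANGE added at TIME."
  (cond ((or (null deltas) (< time (car (first deltas))))
         ;; No entry at TIME yet and TIME precedes the head: insert here.
         (cons (cons time change) deltas))
        ((= time (car (first deltas)))
         ;; An entry already exists at TIME: fold CHANGE into it.
         (cons (cons time (+ (cdr (first deltas)) change))
               (rest deltas)))
        (t
         (cons (first deltas)
               (sketch-adjust time change (rest deltas))))))

(defun sketch-concurrency (deltas)
  "Return the running sum of the delta values, one entry per delta."
  (let ((total 0))
    (mapcar (lambda (delta) (incf total (cdr delta))) deltas)))
```

Recording +1 when a task starts and -1 when it stops, as Execute-Next-Event does, makes the running sum of the deltas the number of tasks active from each recorded time onward.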


;; This function estimates the delivery delay of a message. It
;; should be better than it is now.

(defun Delivery-Delay (Source Destination Length)
  (when (or (>= Source (Number-Of-Nodes))
            (minusp Source)
            (>= Destination (Number-Of-Nodes))
            (minusp Destination))
    (break "PiSim error: illegal node number"))
  (when (or (minusp Length)
            (zerop Length))
    (break "PiSim error: Message length"))
  (let ((Dimension nil)
        (Temp-Dimensions *Machine-Dimensions*)
        (Source-Components nil)
        (Destination-Components nil))
    (Delivery-Delay-1 Dimension Temp-Dimensions
                      Source-Components Destination-Components
                      Source Destination Length)))

(defun Delivery-Delay-1 (Dimension Temp-Dimensions
                         Source-Components
                         Destination-Components
                         Source Destination Length)
  (cond ((null Temp-Dimensions)
         (let ((Source-Component nil)
               (Destination-Component nil)
               (Distance 0))
           (Delivery-Delay-2
            Source-Component
            Destination-Component Distance Length
            Source-Components Destination-Components)))
        (t
         (setq Dimension (car Temp-Dimensions))
         (setq Temp-Dimensions (cdr Temp-Dimensions))
         (setq Source-Components
               (Put-on-End (mod Source Dimension)
                           Source-Components))
         (setq Source (floor Source Dimension))
         (setq Destination-Components
               (Put-on-End (mod Destination Dimension)
                           Destination-Components))
         (setq Destination (floor Destination Dimension))
         (Delivery-Delay-1
          Dimension Temp-Dimensions Source-Components
          Destination-Components Source Destination
          Length))))

(defun Put-on-End (X List)
  (cond ((null List)
         (list X))
        (t (cons (car List)
                 (Put-on-End X (cdr List))))))

(defun Delivery-Delay-2 (Source-Component Destination-Component
                         Distance Length Source-Components
                         Destination-Components)
  (cond ((null Source-Components)
         (+ Distance (- Length 1)))
        (t
         (setq Source-Component (car Source-Components))
         (setq Source-Components (cdr Source-Components))
         (cond ((null Destination-Components)
                (+ Distance (- Length 1)))
               (t
                (setq Destination-Component
                      (car Destination-Components))
                (setq Destination-Components
                      (cdr Destination-Components))
                (setq Distance
                      (+ (abs (- Source-Component
                                 Destination-Component))
                         Distance))
                (Delivery-Delay-2
                 Source-Component Destination-Component
                 Distance Length Source-Components
                 Destination-Components))))))
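
The three Delivery-Delay functions above split each node number into one coordinate per machine dimension (least significant dimension first), sum the absolute coordinate differences, and add Length - 1. The same arithmetic can be sketched in Python as follows. This is an illustrative sketch, not part of the simulator: the names `node_components` and `delivery_delay` are hypothetical, and the sketch assumes (the listing does not say so) that the number of nodes equals the product of the machine dimensions.

```python
def node_components(node, dimensions):
    """Split a node number into one coordinate per machine dimension."""
    components = []
    for dimension in dimensions:
        components.append(node % dimension)  # coordinate along this dimension
        node //= dimension                   # remainder goes to the next dimension
    return components


def delivery_delay(source, destination, length, dimensions):
    """Hop distance between two nodes plus (length - 1)."""
    number_of_nodes = 1
    for dimension in dimensions:  # assumption: nodes = product of dimensions
        number_of_nodes *= dimension
    if not (0 <= source < number_of_nodes and 0 <= destination < number_of_nodes):
        raise ValueError("illegal node number")
    if length <= 0:
        raise ValueError("illegal message length")
    distance = sum(
        abs(s - d)
        for s, d in zip(node_components(source, dimensions),
                        node_components(destination, dimensions))
    )
    return distance + (length - 1)
```

For a hypothetical 4 x 4 machine, `delivery_delay(0, 5, 3, [4, 4])` compares coordinates (0, 0) and (1, 1), giving a distance of 2 and a delay of 4.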

;; This function injects a starting message into the machine. It
;; starts calculating the message length and destination. This
;; message is then enqueued, and events are executed until the
;; event queue is empty.

(defun Inject (Type &rest Arguments)
  (Make-Nodes)
  (Clear-Nodes)
  (Clear-Event-Queue)
  (let* ((Handler (Get-Handler Type))
         (Length (+ (Handler-Arity Handler)
                    (Handler-Number-Of-Locals Handler)
                    2))
         (Destination (random (Number-Of-Nodes)))
         (Arrival-Time (Node-Time (Translate-Node Destination)))
         (Message (Make-Message :Destination Destination
                                :Length Length
                                :Type Type
                                :Arguments Arguments))
         (Event (Make-Event :Time Arrival-Time
                            :Object Message)))
    (Enqueue-Event Event)
    (Execute-Events)))


(defun Execute-Events ()
  (cond ((null *Event-Queue*)
         (values *Event-Queue* *Nodes*))
        (t (Execute-Next-Event)
           (Execute-Events))))

;; Hash Table Functions

(defconstant MIN_HASH_TABLE_SIZE 11)

(defstruct Entry
  (Key nil :type symbol)
  (Value nil :type any))

(defstruct HashTable
  (Num-Buckets nil :type integer)
  (Number-Entries nil :type integer)
  (Buckets nil :type array))

(defun Hash-Insert (Table Key Value)
  (let* ((Index (Hash-Function Key (HashTable-Num-Buckets Table)))
         (New-Table
          (multiple-value-bind (New-Bucket-List Number-Entries)
              (Splice-In-Bucket Value
                                Key
                                (aref (HashTable-Buckets Table) Index)
                                (HashTable-Number-Entries Table))
            (Make-HashTable
             :Num-Buckets (HashTable-Num-Buckets Table)
             :Buckets (Copy-Replace-Elt New-Bucket-List
                                        Index
                                        (HashTable-Buckets Table))
             :Number-Entries Number-Entries))))
    (if (>= (HashTable-Number-Entries New-Table)
            (HashTable-Num-Buckets New-Table))
        (Hash-Resize New-Table)
        New-Table)))

(defun Splice-In-Bucket (Value Key Bucket-List Number-Entries)
  (cond ((or (null Bucket-List)
             (string< Key (Entry-Key (car Bucket-List))))
         (values (cons (Make-Entry :Key Key
                                   :Value Value)
                       Bucket-List)
                 (1+ Number-Entries)))
        (t (let ((This-Entry (car Bucket-List)))
             (cond ((string= Key (Entry-Key This-Entry))
                    (format t "~&Bashing older bucket entry ~A."
                            This-Entry)
                    (values
                     ;; if Key = key of This-Entry, then overwrite the older
                     ;; bucket entry.  (New bucket has same Key as older
                     ;; bucket entry, but new entry value.)
                     (cons (Make-Entry :Key Key
                                       :Value Value)
                           (cdr Bucket-List))
                     Number-Entries))
                   (t (multiple-value-bind (New-Bucket-List Num-Entries)
                          (Splice-In-Bucket Value
                                            Key
                                            (cdr Bucket-List)
                                            Number-Entries)
                        (values
                         (cons This-Entry New-Bucket-List)
                         Num-Entries))))))))

(defun Hash-Resize (Table)
  (let* ((Old-Buckets (HashTable-Buckets Table))
         (Old-Size (HashTable-Num-Buckets Table))
         (New-Size (Determine-Hash-Table-Size
                    (* (HashTable-Num-Buckets Table) 2)))
         (New-Table (Make-HashTable :Num-Buckets New-Size
                                    :Number-Entries 0
                                    :Buckets (Make-Hash-Buckets New-Size))))
    (Copy-Over-Buckets 0 Old-Size Old-Buckets New-Table)))

(defun Copy-Over-Buckets (Index Old-Size Old-Buckets New-Table)
  (cond ((>= Index Old-Size) New-Table)
        (t (let ((Bucket-List (aref Old-Buckets Index)))
             (Copy-Over-Buckets (1+ Index)
                                Old-Size
                                Old-Buckets
                                (Copy-Over-Bucket Bucket-List New-Table))))))

(defun Copy-Over-Bucket (Bucket-List New-Table)
  (cond ((null Bucket-List) New-Table)
        (t (let ((This-Entry (car Bucket-List)))
             (Copy-Over-Bucket (cdr Bucket-List)
                               (Hash-Insert New-Table
                                            (Entry-Key This-Entry)
                                            (Entry-Value This-Entry)))))))


;; This function creates a hash table having the specified # of buckets.
;; Since the size of a hash table must be a prime number, the
;; specified number of buckets is rounded up to a nearby prime.
;; The new table is then initialized.

(defun Make-Hash-Table (&optional Num-Buckets)
  (let ((Size (Determine-Hash-Table-Size
               (or Num-Buckets MIN_HASH_TABLE_SIZE))))
    (Make-HashTable :Num-Buckets Size
                    :Buckets (Make-Hash-Buckets Size)
                    :Number-Entries 0)))

;; This function creates and initializes a bucket array.

(defun Make-Hash-Buckets (Size)
  (make-array Size))

;;; This function looks up a key in the hash table.  If it is
;;; found, the entry pointer is returned.  Otherwise, nil is
;;; returned.

(defun Hash-Lookup (Table Key)
  (let* ((Index (Hash-Function Key
                               (HashTable-Num-Buckets Table)))
         (Bucket-List (aref (HashTable-Buckets Table)
                            Index)))
    (Hash-Lookup-1 Bucket-List Key)))

(defun Hash-Lookup-1 (Bucket-List Key)
  (cond ((or (null Bucket-List)
             (string< Key
                      (Entry-Key (car Bucket-List))))
         nil)
        ((string= Key
                  (Entry-Key (car Bucket-List)))
         (Entry-Value (car Bucket-List)))
        (t
         (Hash-Lookup-1 (cdr Bucket-List) Key))))

;;; This function deletes an entry in the hash table.

(defun Hash-Delete (Table Key)
  (let ((Index (Hash-Function Key
                              (HashTable-Num-Buckets Table))))
    (multiple-value-bind (New-Bucket-List Number-Entries)
        (Splice-Out-Bucket
         Key
         (aref (HashTable-Buckets Table) Index)
         (HashTable-Number-Entries Table))
      (Make-HashTable
       :Num-Buckets (HashTable-Num-Buckets Table)
       :Buckets (Copy-Replace-Bucket New-Bucket-List
                                     Index
                                     (HashTable-Buckets Table))
       :Number-Entries Number-Entries))))

(defun Splice-Out-Bucket (Key Bucket-List Number-Entries)
  (if (null Bucket-List)
      (values nil Number-Entries) ;; fell off end of bucket list
      (let ((This-Entry (car Bucket-List)))
        (cond ((string> Key (Entry-Key This-Entry))
               (multiple-value-bind (New-Bucket-List Num-Entries)
                   (Splice-Out-Bucket Key
                                      (cdr Bucket-List)
                                      Number-Entries)
                 (values
                  (cons This-Entry New-Bucket-List)
                  Num-Entries)))
              ((string= Key (Entry-Key This-Entry))
               (values (cdr Bucket-List)
                       (1- Number-Entries)))
              (t ; Key string< Key of This-Entry => Key isn't found
               (values nil Number-Entries))))))

;;; This function clears all entries in the specified hash
;;; table.

(defun Clear-Hash-Table (Table)
  (let ((Size (HashTable-Num-Buckets Table)))
    (Make-HashTable :Num-Buckets Size
                    :Number-Entries 0
                    :Buckets (Make-Hash-Buckets Size))))

;;; This function picks the first prime number greater than or
;;; equal to the specified size estimate.  The minimum hash table
;;; size is enforced here.

(defun Determine-Hash-Table-Size (Size-Estimate &aux Size)
  (if (< Size-Estimate MIN_HASH_TABLE_SIZE)
      (setq Size MIN_HASH_TABLE_SIZE)
      (setq Size Size-Estimate))
  (if (= (mod Size 2) 0)
      (setq Size (1+ Size)))
  (Determine-Hash-Table-Size-1 Size))


(defun Determine-Hash-Table-Size-1 (Size)
  (if (null (Prime-Number-Test Size))
      (Determine-Hash-Table-Size-1 (+ Size 2))
      Size))

(defun Prime-Number-Test (Number)
  (let ((Index 3))
    (cond ((= Number 2) t)
          ((= (mod Number 2) 0) nil)
          (t (Prime-Number-Test-1 Index Number)))))

(defun Prime-Number-Test-1 (Index Number)
  (cond ((<= (Square Index) Number)
         (if (= (mod Number Index) 0)
             nil)
         (setq Index (+ Index 2))
         (Prime-Number-Test-1 Index Number))
        (t t)))

(defun square (n) (* n n))

;;; This function calculates a hash table index from a key
;;; (symbol->string) and the hash table size.

(defun Hash-Function (Key Size)
  (let* ((Sum 0)
         (Key-String (string Key))
         (Length (1- (string-length Key-String))))
    (setq Sum (Hash-Function-1 Sum Key-String Length))
    (mod Sum Size)))

(defun Hash-Function-1 (Sum Key-String Length)
  (cond ((< Length 0)
         Sum)
        (t
         (setq Sum
               (+ Sum (char-int (aref Key-String Length))))
         (setq Length (1- Length))
         (Hash-Function-1 Sum Key-String Length))))
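
The comments above describe two pieces of arithmetic: the table size is the first prime at or above the size estimate (never below the minimum of 11), and the bucket index is the sum of the key's character codes modulo the table size. A minimal Python sketch of that description follows. The function names are hypothetical, and `is_prime` is a conventional trial-division test written for this sketch (including a guard for numbers below 2 that the listing does not have), not a transcription of Prime-Number-Test.

```python
MIN_HASH_TABLE_SIZE = 11  # same minimum as the listing


def is_prime(number):
    """Trial division by odd candidates up to the square root."""
    if number < 2:
        return False
    if number == 2:
        return True
    if number % 2 == 0:
        return False
    index = 3
    while index * index <= number:
        if number % index == 0:
            return False
        index += 2
    return True


def determine_hash_table_size(size_estimate):
    """First prime >= the estimate, with the minimum size enforced."""
    size = max(size_estimate, MIN_HASH_TABLE_SIZE)
    if size % 2 == 0:
        size += 1          # only odd candidates need testing
    while not is_prime(size):
        size += 2
    return size


def hash_function(key, size):
    """Sum of the key's character codes, reduced modulo the table size."""
    return sum(ord(character) for character in str(key)) % size
```

Under these assumptions an estimate of 24 is rounded up to 29, and the key "AB" (codes 65 and 66) lands in bucket 10 of an 11-bucket table.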
;;; Syntax: Common-lisp; Base: 10.; Package: USER
;;; CST simulator -- original version
;;; queue stuff

(defvar *default-queue-size* 16
  "Initial Queue Size")

(defstruct queue
  (head 0)
  (tail 0)
  (length 0)
  (data-size *default-queue-size*)
  (data (make-array *default-queue-size*)))

(defun queue-first (queue)
  (if (> (queue-length queue) 0)
      (aref (queue-data queue)
            (queue-head queue))))

(defun queue-empty? (queue)
  (zerop (queue-length queue)))

(defun queue-list (queue)
  (if (queue-empty? queue)
      '()
      (let ((data (queue-data queue))
            (head (queue-head queue))
            (tail (queue-tail queue)))
        (if (< head tail)
            (loop for index from head below tail
                  collect (aref data index))
            (nconc (loop for index from head
                           below (queue-data-size queue)
                         collect (aref data index))
                   (loop for index from 0 below tail
                         collect (aref data index)))))))

(defun enqueue (queue obj)
  (let* ((tail (queue-tail queue))
         (length (queue-length queue))
         (data (queue-data queue))
         (old-size (queue-data-size queue)))
    (if (< length (- old-size 2))
        (progn
          (setf (aref data tail) obj)
          (setf (queue-tail queue)
                (mod (1+ (queue-tail queue))
                     old-size))
          (incf (queue-length queue)))
        (progn
          (adjust-array data (* old-size 2))
          (setf (queue-data-size queue)
                (* old-size 2))
          (let ((head (queue-head queue)))
            (if (> head tail) ;; other case requires no
                (progn
                  (loop for index from head below old-size
                        do (setf (aref data (+ old-size index))
                                 (aref data index)))
                  (setf (queue-head queue)
                        (+ old-size head)))))
          (enqueue queue obj)))))

(defun dequeue (queue)
  (if (queue-empty? queue)
      (error "~&Attempt to dequeue from an empty queue ~S" queue)
      (progn
        (let ((elt (aref (queue-data queue)
                         (queue-head queue))))
          (setf (queue-head queue)
                (mod (1+ (queue-head queue))
                     (queue-data-size queue)))
          (decf (queue-length queue))
          elt))))
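
The queue above is a circular buffer that doubles its storage once fewer than two free cells remain and, when the live elements wrap around the end of the old array, moves the block starting at the head into the new upper half. A small Python sketch of that growth rule follows. The class name `RingQueue` and its method names are hypothetical, and a Python list stands in for the Lisp array.

```python
class RingQueue:
    """Circular-buffer queue that doubles its storage when nearly full."""

    def __init__(self, size=16):
        self.data = [None] * size
        self.head = 0      # index of the oldest element
        self.tail = 0      # index where the next element is stored
        self.length = 0

    def is_empty(self):
        return self.length == 0

    def enqueue(self, obj):
        old_size = len(self.data)
        if self.length >= old_size - 2:
            # Grow: double the storage.
            self.data.extend([None] * old_size)
            if self.head > self.tail:
                # Elements wrap around; relocate the head block upward.
                for index in range(self.head, old_size):
                    self.data[old_size + index] = self.data[index]
                self.head += old_size
        self.data[self.tail] = obj
        self.tail = (self.tail + 1) % len(self.data)
        self.length += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("attempt to dequeue from an empty queue")
        element = self.data[self.head]
        self.head = (self.head + 1) % len(self.data)
        self.length -= 1
        return element
```

Relocating only the head block keeps the tail index valid after growth, which is why the non-wrapped case needs no copying at all.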

;;; code to access a node descriptor
;;; node = queue X objects X contexts X method-cache

(defstruct node
  (queue (make-queue))
  (objects (make-array 32))
  (contexts (make-array 32))
  (method-cache (make-array *method-cache-size*))
  (busy-count 0))

(defvar *nodes*)

(defvar *contexts*)

(defvar *nr-nodes* 256 "Must also change nnodes in CST world")

(defvar *step-queue*) ; holds messages awaiting delivery

(defvar *step-nr*)

(defvar *profile*) ; profiling flag, statistics recorded when true

(defvar *profile-list*)

(defvar *log* '() "Message Logging Enable")

(defvar *trace* '() "Whether or not we're tracing")

(defvar *trace-selectors* '()
  "list of selectors we're tracing")

(defvar *method-cache* t)

(defvar *method-cache-size* 10)

(defvar *method-cache-trace* '()
  "Switch for method cache tracing")

(defvar *method-cache-trace-list* '()
  "Global MC Trace list")

(defvar *meter-message-queues* '()
  "Enable message queue size tracing")

(defvar *message-queue-trace* '())

(defun get-node (node-nr)
  (aref *nodes* node-nr))

;;; code to access a message
;;; msg is of the form (msg node-nr header selector obj-id args)

(defun new-msg (node-nr header selector receiver args)
  (if (listp args)
      (append `(msg ,node-nr ,header ,selector ,receiver) args)
      `(msg ,node-nr ,header ,selector ,receiver ,args)))

(defun msg-node (msg)
  (cadr msg))

(defun msg-header (msg)
  (caddr msg))

(defun msg-slotn (n msg)
  (nth (+ n 3) msg))

(defun msg-selector (msg)
  (msg-slotn 0 msg))

(defun msg-receiver (msg)
  (msg-slotn 1 msg))

(defun msg-args (msg)
  (nthcdr 5 msg))

(defun msg-argn (n msg)
  (nth n (msg-args msg)))

(defun is-msg (msg)
  (eq (car msg) 'msg))

(defun msg-length (msg)
  (1- (length msg)))

(defun deliver-msgs ()
  (do ()
      ((queue-empty? *step-queue*))
    (let* ((msg (dequeue *step-queue*))
           (node-nr (msg-node msg))
           (node (get-node node-nr))
           (q (node-queue node)))
      (enqueue q msg))))

;;; step-nodes walks through the nodes and attempts to run a
;;; message on each node

(defun step-nodes ()
  (when *profile*
    (profile-step))
  (when *log*
    (log-step))
  (when *trace*
    (record-traced-selectors *trace-selectors*))
  (deliver-msgs)
  (when *meter-message-queues*
    (record-message-queue-data))
  (dotimes (x *nr-nodes*)
    (step-node x))
  (incf *step-nr*))
;;; Run until no more work.

(defun step-done ()
  (if (queue-empty? *step-queue*)
      (do ((i 0 (+ i 1)))
          ((or (= i *nr-nodes*)
               (not (queue-empty? (node-queue (get-node i)))))
           (= i *nr-nodes*)))))

(defun step-node (node-nr)
  (let* ((node (get-node node-nr))
         (q (node-queue node)))
    (if (not (queue-empty? q))
        (let ((msg (dequeue q)))
          (incf (node-busy-count node))
          (process-msg msg)))))

(defun send-msg (msg)
  (enqueue *step-queue* msg))

(defun cst-start (init-msg)
  (send-msg init-msg)
  (shell-go))

(defun shell-go ()
  (cond ((step-done)
         nil)
        (t (step-nodes)
           (shell-go))))

(defun process-msg (msg)
  (if *profile*
      (setq *nr-msgs-received*
            (+ 1 *nr-msgs-received*)))
  (let ((header (msg-header msg)))
    (case header
      (send (process-send msg))
      (call (process-call msg))
      (new (process-new msg))
      (newco (process-newco msg))
      (reply (process-reply msg)))
    nil))

;;; new creates a new object on a node
;;; new is of the form (new class reply-context reply-slot)
;;; or if the object is distributed, a count may be appended
;;; for distributed objects, new-co messages are sent in a fanout
;;; tree to all constituents.
;;;

(defun process-new (msg)
  (let* ((class-name (msg-slotn 0 msg))
         (reply-context (msg-slotn 1 msg))
         (reply-slot (msg-slotn 2 msg))
         (dist (class-dist (get-class class-name)))
         (id (new-object class-name (msg-node msg))))
    (if dist
        (let ((size (msg-slotn 3 msg)))
          (init-distributed-object id size (msg-node msg)
                                   reply-context reply-slot))
        (reply-to-context reply-context reply-slot id))))


;;; on a reply, stuff data into slot and resume context
;;; message is (reply context-nr slot-nr data)
;;; if value is a value, must allocate copy

(defun process-reply (msg)
  (let* ((context-nr (msg-slotn 0 msg))
         (slot (msg-slotn 1 msg))
         (data (msg-slotn 2 msg))
         (context (get-context context-nr)))
    (if context
        (progn
          (set-slot slot context data)
          (resume-context context-nr)))))

;;; code to send a reply

(defun reply-to-context (context-nr slot value)
  (let ((msg (new-msg (context-to-node context-nr)
                      'reply context-nr slot (list value))))
    (send-msg msg)))

;;; <??> handle did receiver
;;; send creates a new context and executes the first statement
;;; if receiver is not atomic, look up class
;;; ids are referred to like '(id 3) to distinguish them from the integer 3.

(defun process-send (msg)
  (let* ((receiver (msg-receiver msg))
         (node (msg-node msg)))
    (cond ((is-did receiver)
           (let* ((id (did-on-node receiver node)))
             (if id
                 (process-normal-send msg id)
                 (forward-did-message node msg receiver))))
          ((is-co receiver)
           (let ((id (did-on-node `(did ,(second receiver)) node)))
             (process-normal-send msg id)))
          ((is-block receiver)
           (process-block-send msg))
          (t
           (process-normal-send msg receiver)))))

(defun process-normal-send (msg receiver)
  (let* ((selector (msg-selector msg))
         (args (msg-args msg)))
    (if (is-id receiver)
        (let* ((id (second receiver))
               (obj (get-object id))
               (class-name (object-class obj))
               (code (method-lookup selector class-name)))
          (start-code code msg receiver args))
        (let* ((class-name
                (cond ((integerp receiver) 'integer)
                      ((floatp receiver) 'float)
                      ((symbolp receiver) 'symbol)))
               (code (method-lookup selector class-name)))
          (start-code code msg receiver args)))))

(defun forward-did-message (node msg receiver)
  (setf (second msg) (id-to-node receiver))
  (send-msg msg))


(defun init-distributed-object (id size node reply-context
                                reply-slot)
  (let* ((size (if size
                   (min size *nr-nodes*)
                   *default-distobj-size*))
         (did (new-did node size)))
    (send-dist-init node id did 0 size node reply-context
                    reply-slot)))


(defun send-dist-init (node id did index size root reply-context
                       reply-slot)
  (let ((msg (new-msg node 'send 'newco id
                      (list index size root reply-context
                            reply-slot))))
    (set-object-did (get-object (ref-id id)) did)
    (send-msg msg)))


;;; the newco message is a hack to allow distributed object to be
;;; created.

(defun process-block-send (msg)
  (let ((block (get-block (blkid-get-id (msg-receiver msg))))
        (selector (msg-selector msg))
        (args (msg-args msg)))
    (if (eq selector 'value)
        (start-code block msg nil args)
        (cst-error "~&block message other than value ~S" msg))))

(defun start-code (code msg receiver args)
  (if code
      (let ((nr-args (block-nr-args code)))
        (cond ((= (+ nr-args 2)
                  (length args))
               (start-method (msg-node msg) code receiver args))
              (t
               (progn
                 (cst-error "~&Wrong number of arguments in ~S" msg)
                 (cst-error "~&~S actuals, to match ~S formals"
                            args nr-args)))))))


(defun process-newco (msg)
  (let* ((class-name (msg-slotn 0 msg))
         (did (msg-slotn 1 msg))
         (index (msg-slotn 2 msg))
         (size (msg-slotn 3 msg))
         (root (msg-slotn 4 msg))
         (reply-context (msg-slotn 5 msg))
         (reply-slot (msg-slotn 6 msg))
         (id (new-object class-name (msg-node msg))))
    (send-dist-init (msg-node msg) id did index size
                    root reply-context reply-slot)))


;;; create a context, copy args from message, execute to first send

(defun start-method (node code receiver args)
  (let ((context-nr (ref-id (new-context node code receiver))))
    (copy-args args context-nr)
    (advance-context context-nr)))

(defun copy-args (args context-nr)
  (let ((context (get-context context-nr)))
    (loop for arg in args
          for i from 0 do
          (set-context-slot context i arg))))
;;; advances context over next action


(defun advance-context (context-nr)
  (let ((next (execute-instruction context-nr)))
    (when *profile*
      (incf *nr-icodes-executed*))
    (when *method-cache*
      (let* ((node-nr (context-node (get-context context-nr)))
             (node (get-node node-nr))
             (block (context-code (get-context context-nr))))
        (when *method-cache-trace*
          (let ((prev (first *method-cache-trace-list*)))
            (if (not (and (equal (first prev)
                                 *step-nr*)
                          (equal (second prev)
                                 node-nr)))
                (push `(,*step-nr* ,node-nr ,(block-id block)
                        ,(length (block-insts block)))
                      *method-cache-trace-list*))))
        (when (not (method-cache-present-p
                    block
                    (node-method-cache node)))
          (progn
            (incf *nr-blocks-loaded*)
            (method-cache-insert block
                                 (node-method-cache node))))))
    (case next
      (suspend nil)
      (back-up (back-up-context context-nr))
      (continue (advance-context context-nr))
      (dispose (remove-context context-nr))
      (otherwise
       (cst-error "~&Illegal value in advance context:~S"
                  next)))))


;;; <??> other opcodes

(defun execute-instruction (context-nr)
  (let* ((inst (fetch-instruction context-nr))
         (opcode (car inst)))
    (if *profile*
        (setq *nr-insts-executed*
              (+ (- (length inst) 1)
                 *nr-insts-executed*)))
    (execute-instruction-1 inst opcode context-nr)))

(defun execute-instruction-1 (inst opcode context-nr)
  (case opcode
    (move
     (execute-move context-nr inst))
    ((send csend forward)
     (execute-send context-nr inst))
    ((falsejump jump)
     (execute-jump context-nr inst))
    (label
     'continue)
    ((reply reply-x)
     (execute-reply context-nr inst))
    ((return return-x)
     (execute-return context-nr inst))
    ;; implement return icodes
    (reply-console
     (execute-reply-console context-nr inst))
    (echo-console
     (execute-echo-console context-nr inst))
    (newco
     (execute-newco context-nr inst))
    (new
     (execute-new context-nr inst))
    (touch
     (execute-touch context-nr inst))
    (suspend
     'suspend)
    (exit
     'dispose)))


(defun execute-touch (context-nr inst)
  (let* ((context (get-context context-nr))
         (ref (second inst)))
    (if (equal (get-slot ref context) 'c-fut)
        'back-up
        'continue)))

;;; sends away for a new object

(defun execute-new (context-nr inst)
  (let* ((context (get-context context-nr))
         (class-name (caddr inst))
         (dest (cadr inst))
         (size (get-slot (cadddr inst) context)))
    (if (eq class-name 'array)
        (progn
          (set-slot dest context
                    (new-array (context-node context) size))
          'continue)
        (progn
          (set-slot dest context 'c-fut)
          (cst-new class-name context-nr dest size)
          'suspend))))

;;; creates a constituent of a distributed object

(defun execute-newco (context-nr inst)
  (let* ((context (get-context context-nr))
         (slot (cadr inst))
         (args (mapcar #'(lambda (x)
                           (get-slot x context))
                       (cddr inst)))
         (object (get-object (ref-id (context-receiver context))))
         (class (object-class object))
         (did (object-did object))
         (msg (new-msg (car args) 'newco class did
                       (append (cdr args) (list context-nr slot)))))
    (set-slot slot context 'c-fut)
    (send-msg msg)
    'continue))

(defun execute-jump (context-nr inst)
  (let* ((opcode (car inst)))
    (case opcode
      (falsejump
       (if (eq (get-slot (cadr inst)
                         (get-context context-nr))
               'false)
           (do-jump context-nr (caddr inst))
           'continue))
      (jump
       (do-jump context-nr (cadr inst))))))

(defun do-jump (context-nr target)
  (let* ((context (get-context context-nr))
         (code (block-insts (context-code context))))
    (set-context-ip context
                    (find-jump-target code target 0))
    'continue))

(defun find-jump-target (code target nr)
  (if code
      (let* ((stat (car code))
             (type (car stat)))
        (if (and (eq type 'label)
                 (= (cadr stat) target))
            nr
            (find-jump-target (cdr code) target (+ nr 1))))))

;;; does a primop or sends a message

(defun execute-send (context-nr inst)
  (let* ((opcode (first inst))
         (context (get-context context-nr))
         (operation
          (let ((oper (third inst)))
            (if (symbolp oper)
                oper
                (get-slot oper (get-context context-nr)))))
         (rargs (cdddr inst))
         (reply-to
          (case opcode
            ((send csend)
             (cons context-nr (second inst)))
            (forward
             (get-slot (second inst) context)))))
    (basic-send opcode context-nr operation rargs reply-to)))

;;; if the operation is primitive, do it and continue
;;; otherwise, actually do a message send

(defun basic-send (opcode context-nr operation rargs reply-to)
  (let* ((context (get-context context-nr))
         (all-args (mapcar #'(lambda (x)
                               (get-slot x context))
                           rargs))
         (node (context-node context))
         (dest (cdr reply-to))
         (op (is-primitive operation all-args)))
    (if (member 'c-fut all-args)
        'back-up
        (if (and op
                 (equal (car reply-to) context-nr))
            (progn
              (set-slot dest context (apply op all-args))
              'continue)
            (progn
              (cst-send node (car all-args)
                        operation (cdr all-args)
                        (car reply-to) (cdr reply-to))
              (case opcode
                (send
                 (set-slot dest context 'c-fut)
                 'suspend)
                (csend
                 (set-slot dest context 'c-fut)
                 'continue)
                (forward
                 'continue)))))))

(defun execute-move (context-nr inst)
  (let* ((context (get-context context-nr))
         (dest (second inst))
         (src (third inst)))
    (set-slot dest context (get-slot src context))
    'continue))

;;; Reply sends the result and exits the context

(defun execute-reply (context-nr inst)
  (let* ((context (get-context context-nr))
         (reply-context (context-reply-context context))
         (reply-slot (context-reply-slot context))
         (value (get-slot (cadr inst) context)))
    (if reply-context
        (case reply-context
          (console
           (cst-display value))
          (otherwise
           (when reply-slot
             (reply-to-context reply-context reply-slot value)))))
    'dispose))

;;; Return sends the result and continues to run in the context

(defun execute-return (context-nr inst)
  (let* ((context (get-context context-nr))
         (reply-context (context-reply-context context))
         (reply-slot (context-reply-slot context))
         (value (get-slot (cadr inst) context)))
    (if reply-context
        (case reply-context
          (console
           (cst-display value))
          (otherwise
           (when reply-slot
             (reply-to-context reply-context reply-slot value)))))
    'continue))

(defun execute-reply-console (context-nr inst)
  (let* ((context (get-context context-nr))
         (value (get-slot (cadr inst) context)))
    (cst-display value)
    'dispose))

(defun execute-echo-console (context-nr inst)
  (let* ((context (get-context context-nr))
         (val-list
          (loop for val in (rest inst)
                collecting (get-slot val context))))
    (cst-display-list val-list))
  'continue)

;;; returns a numerical offset into a context's arg/var list

(defun compute-slot (slot context)
  (let ((type (car slot))
        (index (cadr slot))
        (code (context-code context)))
    (case type
      (var
       (+ index
          2
          (block-nr-args code)))
      (arg
       index)
      (temp
       (+ index
          2
          (block-nr-args code)
          (block-nr-vars code)))
      (otherwise
       (cst-error "~&Slot must be temp, var, or arg: ~S"
                  slot)))))
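
The offsets computed above imply a layout for a context's state array: arguments first, then two reserved cells, then variables, then temporaries. Elsewhere in this listing the accessors context-reply-context and context-reply-slot read cells nr-args and nr-args + 1, which suggests those are the two reserved cells. A minimal Python sketch of that inferred layout follows. The function names and the string slot kinds are hypothetical stand-ins for the Lisp symbols.

```python
def compute_slot(kind, index, nr_args, nr_vars):
    """Offset of a slot in a context's state array (layout inferred from the listing)."""
    if kind == "arg":
        return index                          # arguments come first
    if kind == "var":
        return index + 2 + nr_args            # skip the args and the two reply cells
    if kind == "temp":
        return index + 2 + nr_args + nr_vars  # temporaries follow the variables
    raise ValueError(f"slot must be temp, var, or arg: {kind!r}")


def reply_context_offset(nr_args):
    """Cell holding the reply context, directly after the arguments."""
    return nr_args


def reply_slot_offset(nr_args):
    """Cell holding the reply slot, directly after the reply context."""
    return nr_args + 1
```

For a block with three arguments and two variables, the first variable sits at offset 5 and the second temporary at offset 8, leaving offsets 3 and 4 for the reply context and reply slot.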

;;; gets a slot e.g., (ivar 0)
;;; <??> fix const and global

(defun get-slot (slot context)
  (if (listp slot)
      (let ((type (car slot))
            (index (cadr slot)))
        (case type
          (ivar
           (object-ivar
            (get-object (ref-id (context-receiver context)))
            index))
          ((arg var temp)
           (let ((n (compute-slot slot context)))
             (context-slot context n)))
          (block
           slot)
          (global
           (get-global index))
          (const
           index)))
      (case slot
        (self
         (context-receiver context))
        (group
         (object-did
          (get-object (ref-id (context-receiver context)))))
        (requester
         (cons (context-reply-context context)
               (context-reply-slot context))))))

;;; sets a slot

(defun set-slot (slot context value)
  (let ((type (car slot))
        (index (cadr slot)))
    (case type
      ((arg var temp)
       (let ((n (compute-slot slot context)))
         (set-context-slot context n value)))
      (ivar
       (set-object-ivar
        (get-object (ref-id (context-receiver context)))
        index
        value))
      (global
       (set-global index value))
      ('()
       '()) ;; do nothing if it's nil
      (otherwise
       (cst-error "~&Slot error ~S" slot)))))

;;; <??> - temporary hack to implement globals need to generate
;;; code to send and receive

(defun set-global (name value)
  (let* ((cell (assoc name *globals*)))
    (if cell
        (rplacd (cdr cell) value)
        (cst-error "~&unknown global ~S" name))))

(defun get-global (name)
  (let* ((cell (assoc name *globals*)))
    (if cell
        (cddr cell)
        (cst-error "~&unknown global ~S" name))))

(defun fetch-instruction (context-nr)
  (let* ((context (get-context context-nr))
         (ip (context-ip context))
         (inst (block-inst ip (context-code context))))
    (set-context-ip context (+ 1 ip))
    inst))

(defun next-instruction (context)
  (let ((ip (context-ip context)))
    (block-inst ip (context-code context))))

(defun back-up-context (context-nr)
  (let* ((context (get-context context-nr))
         (ip (context-ip context)))
    (set-context-ip context (- ip 1))))

;;; resumes a suspended context

(defun resume-context (context-nr)
  (advance-context context-nr))

(defun init-nodes ()
  (setq *step-queue* (make-queue))
  (setq *nodes* (make-array *nr-nodes*))
  (dotimes (x *nr-nodes*)
    (setf (aref *nodes* x) (make-node))))

(defun is-node (node)
  (node-p node))

(defun random-node ()
  (random *nr-nodes*))

(defun print-node (node-nr)
  (let ((node (get-node node-nr)))
    (format *standard-output*
            "~&NODE ~S QUEUE ~S OBJECTS ~S CONTEXTS ~S"
            node-nr (node-queue node)
            (node-objects node) (node-contexts node))))

(dafun  init-contaxta  () 

(aatf  *contaxta*  (maka-array  *init-nr-contaxta*  tadjuatabla  t)) 
(aatf  •nr-contaxta*  'init-nr-contaxta*) 

(aatf  *naxt-contaxt*  0) 

(aatf  *fraa-contaxta*  (maka-atack)) 

(aatf  *contaxt-atata-raaourca*  (maka-array-raaourca) ) ) 

(dafun  initial-contaxt  (nr-alota) 

(gat-array  *contaxt-atata-raaourca*  nr-alota)) 

(defun context-nr (context)
  (nth 1 context))

(defun context-node (context)
  (nth 2 context))

(defun context-code (context)
  (nth 3 context))

(defun context-ip (context)
  (nth 4 context))

(defun set-context-ip (context x)
  (setf (nth 4 context) x))

(defun context-state (context)
  (nth 5 context))

(defun context-receiver (context)
  (nth 6 context))

(defun context-slot (context n)
  (aref (context-state context) n))

(defun set-context-slot (context n x)
  (setf (aref (context-state context) n) x))

(defun context-reply-context (context)
  (context-slot context
                (block-nr-args (context-code context))))

(defun set-context-reply-context (context x)
  (set-context-slot context
                    (block-nr-args (context-code context))
                    x))

(defun context-reply-slot (context)
  (context-slot context
                (+ 1 (block-nr-args (context-code context)))))

(defun set-context-reply-slot (context x)
  (set-context-slot context
                    (+ 1 (block-nr-args (context-code context)))
                    x))

(defun get-context (context-nr)
  (aref *contexts* context-nr))

(defun context-to-node (context-nr)
  (context-node (get-context context-nr)))

(defun find-context (c-nr c-list)
  (loop for context in c-list
        until (= c-nr (context-nr context))
        finally (return context)))

(defun live-contexts ()
  (loop for index from 0 below (length *contexts*)
        when (aref *contexts* index)
        collect (aref *contexts* index)))

(defun context-method (context)
  (block-method (block-id (context-code context))))


;;; A block identifier abstraction
;;; a block id is (block blksymbol)

(defun make-blkid ()
  (gensym "BLOCK"))

(defun blkid-get-id (blkid)
  (cadr blkid))

(defun is-blkid (id)
  (equal (car id) 'block))

(defun block-method (blkid)
  (loop for method in *methods*
        when (eq (caddr method) blkid)
        return method))

(defvar *blocks* '()
  "Icode blocks")

(defun get-block (block-tag)
  (assoc block-tag *blocks*))

(defun block-id (block)
  (car block))

(defun block-nr-args (block)
  (cadr block))

(defun block-nr-vars (block)
  (caddr block))

(defun block-nr-temps (block)
  (cadddr block))

(defun block-insts (block)
  (nth 4 block))

(defun block-inst (n block)
  (nth n (block-insts block)))

;; returns the code

(defun method-lookup (selector class-name)
  (let ((method (method-lookup1 selector class-name)))
    (if (null method)
        (progn
          (format *standard-output*
                  "~&message ~S not implemented for class ~S"
                  selector class-name)
          '())
        method)))

(defun method-lookup1 (selector class-name)
  (let* ((class (get-class class-name)))
    (if class
        (let* ((supers (class-supers class))
               (methods (class-methods class))
               (method (assoc selector methods)))
          (if method
              (get-block (caddr method))
              (if (or (not (listp supers))
                      (eq class-name 'object)
                      (eq class-name nil))
                  '()
                  (method-lookup1 selector (car supers))))))))

(defvar *classes* '()
  "Class Structure and methods")

(defun get-class (class-name)
  (let ((class (assoc class-name *classes*)))
    (if class
        class
        (cst-error "~&Undefined Class ~S" class-name))))

(defun class-name (class)
  (car class))

(defun class-supers (class)
  (cadr class))

(defun class-vars (class)
  (caddr class))

(defun class-methods (class)
  (cadddr class))

(defun class-dist (class)
  (fifth class))

(defvar *objects* nil)

(defun get-object (id)
  (aref *objects* id))

(defun object-id (obj)
  (second obj))

(defun object-did (obj)
  (third obj))

(defun set-object-did (obj x)
  (setf (third obj) x))

(defun object-node (obj)
  (fourth obj))




(defun object-class (obj)
  (fifth obj))

(defun object-state (obj)
  (sixth obj))

(defun object-ivar (obj n)
  (nth n (object-state obj)))

(defun set-object-ivar (obj n x)
  (setf (nth n (object-state obj)) x))

(defun is-object (obj)
  (eq (car obj) 'object))

(defun is-id (ref)
  (and (listp ref)
       (eq (car ref) 'id)))

(defun is-did (ref)
  (and (listp ref)
       (eq (car ref) 'did)))

(defun is-co (ref)
  (and (listp ref)
       (eq (car ref) 'co)))

(defun is-block (ref)
  (and (listp ref)
       (eq (car ref) 'block)))

(defun ref-id (ref)
  (cadr ref))

(defun cst-error (string &rest args)
  (apply #'format *standard-output* string args)
  nil)

(defun cst-display-list (alist)
  (format *standard-output* "~&~3D: " *step-nr*)
  (loop for val in alist
        do (cst-display-1 val)))

(defun cst-display (value)
  (format *standard-output* "~&~3D: " *step-nr*)
  (cst-display-1 value))

(defun cst-display-1 (value)
  (cond ((listp value)
         (let ((type (car value))
               (index (cadr value)))
           (case type
             (id
              (format *standard-output* " ~S" (get-object index)))
             (otherwise
              (format *standard-output* " ~S" value)))))
        ((arrayp value)
         (display-array value))
        (t
         (format *standard-output* " ~S" value))))

(defun display-array (value)
  (let ((y nil))
    (dotimes (x (length value))
      (setq y (cons (aref value x) y)))
    (format *standard-output* " ~S" (reverse y))))


;; Filter out the traced selectors

(defun selectively-copy-traced (sel-list msglist)
  (loop for msg in msglist
        when (member (msg-selector msg) sel-list) collect msg into result
        finally (return result)))

(defvar *nr-msgs-received* 0
  "Number of msgs received in the current time step")

(defvar *nr-insts-executed* 0
  "Insts executed, current time step")

(defvar *nr-icodes-executed* 0
  "Icodes, current time step")

(defvar *nr-blocks-loaded* 0
  "Number of Method Cache misses, current time step")

(defun profile-step ()
  (push (make-profile-frame *step-nr*
                            (queue-length *step-queue*)
                            *nr-msgs-received*
                            *nr-insts-executed*
                            *nr-icodes-executed*
                            *nr-blocks-loaded*
                            (avg-queue-length)
                            (total-message-length))
        *profile-list*)
  (setf *nr-insts-executed* 0)
  (setf *nr-icodes-executed* 0)
  (setf *nr-blocks-loaded* 0)
  (setf *nr-msgs-received* 0))

(defun make-profile-frame (time-step msgs-new msgs-done
                           insts-exec icodes-exec blocks-loaded
                           avg-q-length msgs-words)
  (list time-step msgs-new msgs-done
        insts-exec icodes-exec blocks-loaded
        avg-q-length msgs-words))

(defun record-message-queue-data ()
  (push (cons *step-nr*
              (loop for index from 0 below *nr-nodes*
                    with mqlen = 0
                    unless (zerop
                            (setf mqlen
                                  (loop for message
                                        in (queue-list
                                            (node-queue (get-node index)))
                                        sum (msg-length message))))
                    collect (list index mqlen)))
        *message-queue-trace*))

(defun avg-queue-length ()
  (let ((tql 0))
    (dotimes (x *nr-nodes*)
      (setq tql
            (+ tql
               (queue-length (node-queue (get-node x))))))
    (/ tql *nr-nodes*)))

(defun total-message-length ()
  (reduce #'+
          (mapcar #'message-length (queue-list *step-queue*))))


;; statistics functions

(defun message-length (message)
  (- (length message) 2))

(defvar *log-list* '()
  "Log of Messages")

;;; log all messages this step

(defun log-step ()
  (push (list *step-nr*
              (copy-list (queue-list *step-queue*)))
        *log-list*))

(defvar *trace-list* '()
  "Messages we've recorded")

;;; record traced messages this step

(defun record-traced-selectors (traced)
  (let ((new-msgs
         (selectively-copy-traced traced
                                  (queue-list *step-queue*))))
    (if new-msgs
        (push (list *step-nr* new-msgs) *trace-list*))))
;;; Syntax: Common-Lisp; Base: 10.; Package: USER
;;; CST simulator -- functional version

;;; queue stuff

(defvar *default-queue-size* 16 "Initial Queue Size")

(defstruct queue
  (head 0)
  (tail 0)
  (length 0)
  (data-size *default-queue-size*)
  (data (make-array *default-queue-size* :adjustable t)))

(defun queue-first (queue)
  (if (> (queue-length queue) 0)
      (aref (queue-data queue) (queue-head queue))))

(defun queue-empty? (queue)
  (= (queue-length queue) 0))

(defun queue-list (queue)
  (if (queue-empty? queue)
      '()
      (let ((data (queue-data queue))
            (head (queue-head queue))
            (tail (queue-tail queue)))
        (if (< head tail)
            (let ((index head)
                  (list nil)
                  (end-index tail))
              (queue-list-1 index end-index data list))
            ;; wrapped queue: head..end of array, then 0..tail
            (append
             (let ((index head)
                   (list nil)
                   (end-index (queue-data-size queue)))
               (queue-list-1 index end-index data list))
             (let ((index 0)
                   (list nil)
                   (end-index tail))
               (queue-list-1 index end-index data list)))))))

(defun queue-list-1 (index end-index data list)
  (cond ((not (< index end-index))
         list)
        (t (setq list (cons (aref data index) list))
           (setq index (1+ index))
           (queue-list-1 index end-index data list))))

(defun enqueue (queue obj)
  (let* ((length (queue-length queue))
         (old-size (queue-data-size queue))
         (big-enough-queue
          (if (< length (1- old-size))
              queue
              (grow-queue queue))))
    (enqueue-base big-enough-queue obj)))

(defun enqueue-base (queue obj)
  (let ((old-size (queue-data-size queue)))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail (queue-tail queue)
                      :length (queue-length queue)
                      :data-size (queue-data-size queue)
                      :data (copy-replace-elt obj
                                              (queue-tail queue)
                                              (queue-data queue))))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail (mod (1+ (queue-tail queue))
                                 old-size)
                      :length (queue-length queue)
                      :data-size (queue-data-size queue)
                      :data (queue-data queue)))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail (queue-tail queue)
                      :length (1+ (queue-length queue))
                      :data-size (queue-data-size queue)
                      :data (queue-data queue)))
    queue))

(defun grow-queue (queue)
  (let* ((old-size (queue-data-size queue))
         (new-size (* old-size 3))
         (old-data (queue-data queue))
         (new-data (make-array new-size))
         (head (queue-head queue))
         (number-elements (queue-length queue)))
    (setq new-data
          (copy-over-elts
           old-data new-data head old-size number-elements))
    (setq queue
          (make-queue :head 0
                      :tail (queue-tail queue)
                      :length (queue-length queue)
                      :data-size (queue-data-size queue)
                      :data (queue-data queue)))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail number-elements
                      :length (queue-length queue)
                      :data-size (queue-data-size queue)
                      :data (queue-data queue)))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail (queue-tail queue)
                      :length number-elements
                      :data-size (queue-data-size queue)
                      :data (queue-data queue)))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail (queue-tail queue)
                      :length (queue-length queue)
                      :data-size (* old-size 2)
                      :data (queue-data queue)))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail (queue-tail queue)
                      :length (queue-length queue)
                      :data-size (queue-data-size queue)
                      :data new-data))))

(defun copy-over-elts (old-data new-data from old-size number-elements)
  (copy-over-elts-1 old-data new-data 0 from old-size number-elements))

(defun copy-over-elts-1 (old-data new-data new-index from old-size
                         number-elements)
  (cond ((>= new-index number-elements)
         new-data)
        (t (copy-over-elts-1
            old-data
            (copy-replace-elt
             (aref old-data (mod (+ from new-index) old-size))
             new-index
             new-data)
            (1+ new-index)
            from
            old-size
            number-elements))))

(defun dequeue (queue)
  (let ((elt (aref (queue-data queue)
                   (queue-head queue))))
    (setq queue (make-queue :head (mod (1+ (queue-head queue))
                                       (queue-data-size queue))
                            :tail (queue-tail queue)
                            :length (queue-length queue)
                            :data-size (queue-data-size queue)
                            :data (queue-data queue)))
    (setq queue
          (make-queue :head (queue-head queue)
                      :tail (queue-tail queue)
                      :length (1- (queue-length queue))
                      :data-size (queue-data-size queue)
                      :data (queue-data queue)))
    (values elt queue)))

;;; code to access a node descriptor
;;; node = queue X objects X contexts X method-cache

(defstruct node
  (queue (make-queue))
  (objects (make-array 32))
  (contexts (make-array 32))
  (method-cache (make-array *method-cache-size*))
  (busy-count 0))

(defstruct msg
  (node nil)      ;; a node number
  (header nil)
  (selector nil)
  (receiver nil)
  (args nil))     ;; a list

(defstruct context
  (nr nil)
  (node nil)
  (code nil)
  (ip nil)
  (state nil)
  (receiver nil))

(defstruct block
  (id nil)
  (nr-args nil)
  (nr-vars nil)
  (nr-temps nil)
  (insts nil))

(defstruct class
  (name nil)
  (supers nil)
  (vars nil)
  (methods nil)
  (dist nil))

(defstruct object
  (id nil)
  (did nil)
  (node nil)
  (class nil)
  (state nil))

(defun object-ivar (obj n)
  (nth n (object-state obj)))

(defun is-object (obj)
  (object-p obj))

(defun block-inst (n block)
  (nth n (block-insts block)))

(defvar *nodes*)

(defvar *contexts*)

(defvar *step-queue*)

(defvar *step-nr*)

(defvar *nr-nodes* 256 "Must also change nnodes in CST world")
(defvar *profile*) ;profiling flag, statistics recorded when true.
(defvar *profile-list*)

(defvar *log* '()) ; message logging enable

(defvar *trace* '() "whether or not we're tracing")

(defvar *trace-selectors* '() "List of selectors we're tracing")

(defvar *method-cache* t)

(defvar *method-cache-size* 10)

(defvar *method-cache-trace* '()
  "Switch for method cache tracing")

(defvar *method-cache-trace-list* '()
  "Global MC Trace list")

(defvar *meter-message-queues* '()
  "Enable message queue size tracing")

(defvar *message-queue-trace* '())

(defvar *blocks* '()
  "Icode blocks")

(defvar *classes* '()
  "Class structure and methods")

(defvar *objects*)

(defun get-node (node-nr)
  (aref *nodes* node-nr))

(defun get-block (block-tag)
  (assoc block-tag *blocks*))

(defun get-class (class-name)
  (let ((class (assoc class-name *classes*)))
    (if class
        class
        (cst-error "~&Undefined Class ~S" class-name))))

(defun get-object (id)
  (aref *objects* id))

(defun msg-argn (n msg)
  (nth n (msg-args msg)))

(defun msg-length (msg)
  (if (listp (msg-args msg))
      (+ 4 (length (msg-args msg)))
      5))

(defun deliver-msgs ()
  (cond ((queue-empty? *step-queue*)
         nil)
        (t (multiple-value-bind (msg new-step-queue)
               (dequeue *step-queue*)
             (setq *step-queue* new-step-queue)
             (let* ((node-nr (msg-node msg))
                    (node (get-node node-nr))
                    (q (node-queue node))
                    (new-q (enqueue q msg))
                    (new-node
                     (make-node :queue new-q
                                :objects (node-objects node)
                                :contexts (node-contexts node)
                                :method-cache
                                (node-method-cache node)
                                :busy-count
                                (node-busy-count node))))
               (setq *nodes*
                     (copy-replace-elt new-node node-nr *nodes*))))
           (deliver-msgs))))

;;; step-nodes walks through the nodes and attempts to run a message
;;; on each node

(defun step-nodes ()
  (when *profile*
    (profile-step))
  (when *log*
    (log-step))
  (when *trace*
    (record-traced-selectors *trace-selectors*))
  (deliver-msgs)
  (when *meter-message-queues*
    (record-message-queue-data))
  (iteratively-step-nodes 0)
  (setq *step-nr* (1+ *step-nr*)))

(defun iteratively-step-nodes (x)
  (if (>= x (array-total-size *nodes*))
      nil
      ;; the two else forms are wrapped in PROGN so that IF is
      ;; valid in standard Common Lisp
      (progn
        (step-node x)
        (iteratively-step-nodes (1+ x)))))

;; Run until no more work.

(defun step-done ()
  (if (queue-empty? *step-queue*)
      (nodes-unemployed? 0)
      nil))

(defun nodes-unemployed? (i)
  (cond ((>= i (array-total-size *nodes*))
         t)
        ((queue-empty? (node-queue (get-node i)))
         (nodes-unemployed? (+ i 1)))
        (t nil)))

(defun step-node (node-nr)
  (let* ((node (get-node node-nr))
         (q (node-queue node)))
    (if (queue-empty? q)
        nil
        (multiple-value-bind (msg new-queue)
            (dequeue q)
          (setq node
                (make-node :queue new-queue
                           :objects (node-objects node)
                           :contexts (node-contexts node)
                           :busy-count (1+ (node-busy-count node))
                           :method-cache (node-method-cache node)))
          (setq *nodes*
                (copy-replace-elt node node-nr *nodes*))
          (multiple-value-bind (new-nodes new-step-queue)
              (process-msg msg *nodes* *step-queue*)
            (setq *nodes* new-nodes
                  *step-queue* new-step-queue))))))

(defun send-msg (msg)
  (setq *step-queue* (enqueue *step-queue* msg)))

(defun cst-start (init-msg)
  (send-msg init-msg)
  (shell-go))

(defun shell-go ()
  (cond ((step-done)
         nil)
        (t (step-nodes)
           (shell-go))))

(defun process-msg (msg)
  (if *profile*
      (setq *nr-msgs-received*
            (+ 1 *nr-msgs-received*)))
  (let ((header (msg-header msg)))
    (case header
      (send (process-send msg))
      (call (process-call msg))
      (new (process-new msg))
      (newco (process-newco msg))
      (reply (process-reply msg)))
    nil))

;;; new creates a new object on a node
;;; new is of the form (new class reply-context reply-slot)
;;; or if the object is distributed, a count may be appended
;;; for distributed objects, new-co messages are sent in a
;;; fanout tree to all constituents.
;;; <??>

(defun process-new (msg)
  (let* ((class-name (msg-selector msg))
         (reply-context (msg-receiver msg))
         (reply-slot (first (msg-args msg)))
         (dist (class-dist (get-class class-name)))
         (id (new-object class-name (msg-node msg))))
    (if dist
        (let ((size (second (msg-args msg))))
          (init-distributed-object id size (msg-node msg)
                                   reply-context reply-slot))
        (reply-to-context reply-context reply-slot id))))


(defun reply-to-context (context-nr slot value)
  (let ((msg
         (make-msg :node (context-to-node context-nr)
                   :header 'reply
                   :selector context-nr
                   :receiver slot
                   :args (list value))))
    (send-msg msg)))

;;; <??> handle did receiver
;;; send creates a new context and executes the first statement
;;; if receiver is not atomic, look up class
;;; Ids are referred to like '(id 3) to distinguish them from the integer 3

(defun process-send (msg)
  (let* ((receiver (msg-receiver msg))
         (node (msg-node msg)))
    (cond ((is-did receiver)
           (let* ((id (did-on-node receiver node)))
             (if id
                 (process-normal-send msg id)
                 (forward-did-message node msg receiver))))
          ((is-co receiver)
           (let ((id (did-on-node `(did ,(second receiver)) node)))
             (process-normal-send msg id)))
          ((is-block receiver)
           (process-block-send msg))
          (t
           (process-normal-send msg receiver)))))


(defun init-distributed-object (id size node reply-context
                                reply-slot)
  (let* ((size (if size
                   (min size *nr-nodes*)
                   *default-distobj-size*))
         (did (new-did node size)))
    (send-dist-init node id did 0 size node reply-context
                    reply-slot)))

(defun send-dist-init (node id did index size root reply-context
                       reply-slot)
  (let ((msg
         (make-msg :node node
                   :header 'send
                   :selector 'newco
                   :receiver id
                   :args
                   (list index size root reply-context reply-slot)))
        (object (get-object (ref-id id))))
    (setq *objects*
          (copy-replace-elt
           (make-object :id (object-id object)
                        :did did
                        :node (object-node object)
                        :class (object-class object)
                        :state (object-state object)
                        :ivar (object-ivar object))
           (ref-id id)
           *objects*))
    (send-msg msg)))


;;; the newco message is a hack to allow distributed object to be
;;; created.

(defun process-newco (msg)
  (let* ((class-name (msg-selector msg))
         (did (msg-receiver msg))
         (index (first (msg-args msg)))
         (size (second (msg-args msg)))
         (root (third (msg-args msg)))
         (reply-context (fourth (msg-args msg)))
         (reply-slot (fifth (msg-args msg)))
         (id (new-object class-name (msg-node msg))))
    (send-dist-init (msg-node msg) id did index size
                    root reply-context reply-slot)))

;;; on a reply, stuff data into slot and resume context
;;; message is (reply context-nr slot-nr data)
;;; if value is a value, must allocate copy

(defun process-reply (msg)
  (let* ((context-nr (msg-selector msg))
         (slot (msg-receiver msg))
         (data (first (msg-args msg)))
         (context (get-context context-nr)))
    (if context
        (progn
          (set-slot slot context data)
          (resume-context context-nr)))))

;;; code to send a reply

(defun process-normal-send (msg receiver)
  (let* ((selector (msg-selector msg))
         (args (msg-args msg)))
    (if (is-id receiver)
        (let* ((id (second receiver))
               (obj (get-object id))
               (class-name (object-class obj))
               (code (method-lookup selector class-name)))
          (start-code code msg receiver args))
        (let* ((class-name
                (cond ((integerp receiver) 'integer)
                      ((floatp receiver) 'float)
                      ((symbolp receiver) 'symbol)))
               (code (method-lookup selector class-name)))
          (start-code code msg receiver args)))))

(defun forward-did-message (node msg receiver)
  (setq msg
        (make-msg :node (id-to-node receiver)
                  :header (msg-header msg)
                  :selector (msg-selector msg)
                  :receiver (msg-receiver msg)
                  :args (msg-args msg)))
  (send-msg msg))

(defun process-block-send (msg)
  (let ((block (get-block (blkid-get-id (msg-receiver msg))))
        (selector (msg-selector msg))
        (args (msg-args msg)))
    (if (eq selector 'value)
        (start-code block msg nil args)
        (cst-error "~&Block message other than value ~S" msg))))

(defun start-code (code msg receiver args)
  (if code
      (let ((nr-args (block-nr-args code)))
        (cond ((= (+ nr-args 2)
                  (length args))
               (start-method (msg-node msg) code receiver args))
              (t
               (progn
                 (cst-error "~&Wrong number of arguments in ~S" msg)
                 (cst-error "~&~S actuals, to match ~S formals"
                            args nr-args)))))))

;;; create a context, copy args from message, execute to first send

(defun start-method (node code receiver args)
  (let ((context-nr (ref-id (new-context node code receiver))))
    (copy-args args context-nr)
    (advance-context context-nr)))

(defun copy-args (args context-nr)
  (let ((context (get-context context-nr)))
    (let ((arg nil)
          (i 0))
      (copy-args-1 arg args i context))))

(defun copy-args-1 (arg args i context)
  (cond ((null args)
         nil)
        (t
         (setq arg (car args))
         (multiple-value-bind (value new-context)
             (set-context-slot context i arg)
           (setq context new-context))
         (setq args (cdr args))
         (setq i (1+ i))
         (copy-args-1 arg args i context))))

;;; advances context per next action

(defun advance-context (context-nr)
  (let ((next (execute-instruction context-nr)))
    (when *profile*
      (setq *nr-icodes-executed*
            (1+ *nr-icodes-executed*)))
    (when *method-cache*
      (let* ((node-nr (context-node (get-context context-nr)))
             (node (get-node node-nr))
             (block (context-code (get-context context-nr))))
        (when *method-cache-trace*
          (let ((prev (first *method-cache-trace-list*)))
            (if (not (and (equal (first prev)
                                 *step-nr*)
                          (equal (second prev)
                                 node-nr)))
                (setq *method-cache-trace-list*
                      (cons (list *step-nr* node-nr
                                  (block-id block)
                                  (length (block-insts block)))
                            *method-cache-trace-list*)))))
        (when (not (method-cache-present-p
                    block
                    (node-method-cache node)))
          (progn
            (setq *nr-blocks-loaded*
                  (1+ *nr-blocks-loaded*))
            (method-cache-insert block
                                 (node-method-cache node))))))
    (case next
      (suspend nil)
      (back-up (back-up-context context-nr))
      (continue (advance-context context-nr))
      (dispose (remove-context context-nr))
      (otherwise
       (cst-error "~&Illegal value in advance context:~S"
                  next)))))


;;; <??> other opcodes

(defun execute-instruction (context-nr)
  (let* ((inst (fetch-instruction context-nr))
         (opcode (car inst)))
    (if *profile*
        (setq *nr-insts-executed*
              (+ (- (length inst) 1)
                 *nr-insts-executed*)))
    (execute-instruction-1 inst opcode context-nr)))

(defun execute-instruction-1 (inst opcode context-nr)
  (case opcode
    (move
     (execute-move context-nr inst))
    ((send csend forward)
     (execute-send context-nr inst))
    ((falsejump jump)
     (execute-jump context-nr inst))
    (label
     'continue)
    ((reply reply-x)
     (execute-reply context-nr inst))
    ((return return-x)
     (execute-return context-nr inst))
    ; implement return icodes
    (reply-console
     (execute-reply-console context-nr inst))
    (echo-console
     (execute-echo-console context-nr inst))
    (newco
     (execute-newco context-nr inst))
    (new
     (execute-new context-nr inst))
    (touch
     (execute-touch context-nr inst))
    (suspend
     'suspend)
    (exit
     'dispose)))

(defun execute-touch (context-nr inst)
  (let* ((context (get-context context-nr))
         (ref (second inst)))
    (if (equal (get-slot ref context) 'c-fut)
        'back-up
        'continue)))

;;; sends away for a new object

(defun execute-new (context-nr inst)
  (let* ((context (get-context context-nr))
         (class-name (caddr inst))
         (dest (cadr inst))
         (size (get-slot (cadddr inst) context)))
    (if (eq class-name 'array)
        (progn
          (set-slot dest context
                    (new-array (context-node context) size))
          'continue)
        (progn
          (set-slot dest context 'c-fut)
          (cst-new class-name context-nr dest size)
          'suspend))))

;;; creates a constituent of a distributed object

(defun execute-newco (context-nr inst)
  (let* ((context (get-context context-nr))
         (slot (cadr inst))
         (args (mapcar #'(lambda (x)
                           (get-slot x context))
                       (cddr inst)))
         (object (get-object (ref-id (context-receiver context))))
         (class (object-class object))
         (did (object-did object))
         (msg
          (make-msg :node (car args)
                    :header 'newco
                    :selector class
                    :receiver did
                    :args
                    (append (cdr args) (list context-nr slot)))))
    (set-slot slot context 'c-fut)
    (send-msg msg)
    'continue))

(defun execute-jump (context-nr inst)
  (let* ((opcode (car inst)))
    (case opcode
      (falsejump
       (if (eq (get-slot (cadr inst)
                         (get-context context-nr))
               'false)
           (do-jump context-nr (caddr inst))
           'continue))
      (jump
       (do-jump context-nr (cadr inst))))))

(defun do-jump (context-nr target)
  (let* ((context (get-context context-nr))
         (code (block-insts (context-code context))))
    (setq *contexts*
          (copy-replace-elt
           (make-context :nr (context-nr context)
                         :node (context-node context)
                         :code (context-code context)
                         :ip (find-jump-target code target 0)
                         :state (context-state context)
                         :receiver (context-receiver context))
           context-nr
           *contexts*))
    'continue))

(defun find-jump-target (code target nr)
  (if code
      (let* ((stat (car code))
             (type (car stat)))
        (if (and (eq type 'label) (= (cadr stat) target))
            nr
            (find-jump-target (cdr code) target (+ nr 1))))))

; does a primop or sends a message

(defun execute-send (context-nr inst)
  (let* ((opcode (first inst))
         (context (get-context context-nr))
         (operation
          (let ((oper (third inst)))
            (if (symbolp oper)
                oper
                (get-slot oper (get-context context-nr)))))
         (rargs (cdddr inst))
         (reply-to
          (case opcode
            ((send csend)
             (cons context-nr (second inst)))
            (forward
             (get-slot (second inst) context)))))
    (basic-send opcode context-nr operation rargs reply-to)))
;;; if the operation is primitive, do it and continue
;;; otherwise, actually do a message send

(defun basic-send (opcode context-nr operation rargs reply-to)
  (let* ((context (get-context context-nr))
         (all-args (mapcar #'(lambda (x)
                               (get-slot x context))
                           rargs))
         (node (context-node context))
         (dest (cdr reply-to))
         (op (is-primitive operation all-args)))
    (if (member 'c-fut all-args)
        'back-up
        (if (and op
                 (equal (car reply-to) context-nr))
            (progn
              (set-slot dest context (apply op all-args))
              'continue)
            (progn
              (cst-send node (car all-args)
                        operation (cdr all-args)
                        (car reply-to) (cdr reply-to))
              (case opcode
                (send
                 (set-slot dest context 'c-fut)
                 'suspend)
                (csend
                 (set-slot dest context 'c-fut)
                 'continue)
                (forward
                 'continue)))))))

(defun execute-move (context-nr inst)
  (let* ((context (get-context context-nr))
         (dest (second inst))
         (src (third inst)))
    (set-slot dest context (get-slot src context))
    'continue))

;;; Reply sends the result and exits the context

(defun execute-reply (context-nr inst)
  (let* ((context (get-context context-nr))
         (reply-context (context-reply-context context))
         (reply-slot (context-reply-slot context))
         (value (get-slot (cadr inst) context)))
    (if reply-context
        (case reply-context
          (console
           (cst-display value))
          (otherwise
           (when reply-slot
             (reply-to-context reply-context reply-slot
                               value)))))
    'dispose))


(cons  (gat-slot  val  contaxt) 

(axacuta-acho-consola-1  val  vals  context))))) 

;;;  returns  a  nuMrical  offset  into  a  context's  arg/var  list 

(dafun  coaputa-slot  (slot  contaxt) 

(lat  ((type  (car  slot)) 

(index  (cadr  slot)) 

(coda  (contaxt -coda  context))) 

(casa  type 
(var 

(♦  index 
2 

(bloc)c-nr-args  coda) ) ) 

(arg 

index) 

(tai^ 

(♦  index 
2 

(bloc)c-nr-arga  coda) 

(bloc)t-nr-vars  coda) ) ) 

(otherwise 

(cst -error  •-kslot  »ust  be  tasp/  var,  or  arg:  -S*  slot))))) 

;;;  gats  a  slot  a.g.,  (ivar  0) 

;;;  <77>  fix  const  and  global 

(dafun  gat -slot  (slot  context) 

(if  (listp  slot) 

(lat  ((type  (car  slot)) 

(index  (cadr  slot))) 

(ease  type 
(ivar 

(objact-ivar 

(gat -object  (raf-id  (contaxt -receiver  contaxt) ) ) 
index) ) 

((arg  var  tasp) 

(lat  ( (n  (cooputa-slot  slot  context)) 

(context-slot  contaxt  n)))) 

(bloc)c 

slot) 

(global 

(gat -global  index) ) 

(const 
index) ) ) 

(casa  slot 
(self 

(contaxt-raeaivar  context)) 

(group 

(object -did 

(gat -object  (raf-id  ( contaxt -receiver  contaxt) ))) ) 
(requester 

(cons  (contaxt-raply-contaxt  contaxt) 

(contaxt -reply-slot  context)))))) 

;;;  sets  a  slot 


; ; ;  Return  sands  the  result  and  continues  to  run  in  the  contaxt 

(dafun  axacuta-raturn  (contaxt-nr  inst) 

(lat*  ((context  (gat-contaxt  contaxt-nr)) 

(reply-context  (contaxt-raply-contaxt  context)) 

(reply-slot  (contaxt-raply-slot  contaxt) ) 

(value  (gat-slot  (cadr  inst)  context))) 

(if  reply-context 
(casa  reply-context 
(console 

(cst-di^lay  value) ) 

(otherwise 

(whan  reply-slot 

(raply-to-contaxt  reply-context  reply-slot  value))))) 
'continue) ) 

(dafun  axacuta-raply-consola  (contaxt-nr  inst) 

(lat*  ((context  (gat-contaxt  contaxt-nr)) 

(value  (gat-slot  (cadr  inst)  context))) 

(cst -display  value) 

'dispose) ) 

(defun execute-echo-console (context-nr inst)
  (let* ((context (get-context context-nr))
         (val-list
          (let ((val nil))
            (execute-echo-console-1 val (rest inst) context))))
    (cst-display-list val-list))
  'continue)

(defun execute-echo-console-1 (val vals context)
  (cond ((null vals)
         nil)
        (t
         (setq val (car vals))
         (setq vals (cdr vals))


(defun set-slot (slot context value)
  (let ((type (car slot))
        (index (cadr slot)))
    (case type
      ((arg var temp)
       (let ((n (compute-slot slot context)))
         (multiple-value-bind (value new-context)
             (set-context-slot context n value)
           value)))
      (ivar
       (let* ((id (ref-id (context-receiver context)))
              (object (get-object id)))
         (setq *objects*
               (copy-replace-elt
                (make-object :id (object-id object)
                             :did (object-did object)
                             :node (object-node object)
                             :class (object-class object)
                             :state
                             (replace-nth index
                                          (object-state object)
                                          value))
                id
                *objects*))
         value))
      (global
       (set-global index value))
      (nil
       '())  ;; do nothing if it's nil
      (otherwise
       (cst-error "~&slot error ~S" slot)))))

(defun replace-nth (n list value)
  (cond ((null list)
         nil)
        ((= n 0)
         (cons value (cdr list)))
        (t
         (cons (car list)
               (replace-nth (1- n)
                            (cdr list)
                            value)))))

;;; <??> - temporary hack to implement globals need to generate
;;; code to send and receive

(defun set-global (name value)
  (let* ((cell (assoc name *globals*)))
    (if cell
        (setq *globals*
              (replace-global
               name
               (cons (car cell) value)
               *globals*))
        (cst-error "~&unknown global ~S" name))))

(defun replace-global (name cell globals)
  (cond ((null globals)
         nil)
        ((eql name (car (car globals)))
         (cons (cons name cell)
               (cdr globals)))
        (t
         (cons (car globals)
               (replace-global name cell (cdr globals))))))

(defun get-global (name)
  (let* ((cell (assoc name *globals*)))
    (if cell
        (cddr cell)
        (cst-error "~&unknown global ~S" name))))

(defun fetch-instruction (context-nr)
  (let* ((context (get-context context-nr))
         (ip (context-ip context))
         (inst (block-inst ip (context-code context))))
    (setq *contexts*
          (copy-replace-elt
           (make-context :nr (context-nr context)
                         :node (context-node context)
                         :code (context-code context)
                         :ip (+ 1 ip)
                         :state (context-state context)
                         :receiver (context-receiver context))
           context-nr
           *contexts*))
    inst))

(defun next-instruction (context)
  (let ((ip (context-ip context)))
    (block-inst ip (context-code context))))

(defun back-up-context (context-nr)
  (let* ((context (get-context context-nr))
         (ip (context-ip context))
         (new-ip (- ip 1)))
    (setq *contexts*
          (copy-replace-elt
           (make-context :nr (context-nr context)
                         :node (context-node context)
                         :code (context-code context)
                         :ip new-ip
                         :state (context-state context)
                         :receiver (context-receiver context))
           context-nr
           *contexts*))
    new-ip))

; resumes a suspended context
(defun resume-context (context-nr)
  (advance-context context-nr))

(defun init-nodes ()
  (setq *step-queue* (make-queue))
  (setq *nodes* (make-array *nr-nodes*))
  (let ((x 0))
    (init-nodes-1 x *nr-nodes*)))

(defun init-nodes-1 (x n)
  (cond ((not (< x n))
         nil)
        (t
         (setq *nodes*
               (copy-replace-elt (make-node) x *nodes*))
         (setq x (1+ x))
         (init-nodes-1 x n))))


(defun  is-node  (node) 

(node-p  node) ) 

(defun random-node ()
  (random *nr-nodes*))

(defun print-node (node-nr)
  (let ((node (get-node node-nr)))
    (format *standard-output* "~&NODE ~S QUEUE ~S OBJECTS ~S CONTEXTS ~S"
            node-nr (node-queue node)
            (node-objects node) (node-contexts node))))

(defun init-contexts ()
  (setf *contexts* (make-array *init-nr-contexts* :adjustable t))
  (setf *nr-contexts* *init-nr-contexts*)
  (setf *next-context* 0)
  (setf *free-contexts* (make-stack))
  (setf *context-state-resource* (make-array-resource)))

(defun initial-context (nr-slots)
  (get-array *context-state-resource* nr-slots))

(defun context-slot (context n)
  (aref (context-state context) n))

(defun set-context-slot (context n x)
  (let ((new-context
         (make-context :nr (context-nr context)
                       :node (context-node context)
                       :code (context-code context)
                       :ip (context-ip context)
                       :state (copy-replace-elt
                               x n (context-state context))
                       :receiver (context-receiver context))))
    (setq *contexts*
          (copy-replace-elt
           new-context
           (context-nr context)
           *contexts*))
    (values x new-context)))

(defun context-reply-context (context)
  (context-slot context
                (block-nr-args (context-code context))))

(defun set-context-reply-context (context x)
  (set-context-slot context
                    (block-nr-args (context-code context))
                    x))

(defun context-reply-slot (context)
  (context-slot context
                (+ 1 (block-nr-args (context-code context)))))

(defun set-context-reply-slot (context x)
  (set-context-slot context
                    (+ 1 (block-nr-args (context-code context)))
                    x))

(defun get-context (context-nr)
  (aref *contexts* context-nr))

(defun context-to-node (context-nr)
  (context-node (get-context context-nr)))

(defun find-context (c-nr c-list)
  (let ((context nil))
    (find-context-1 context c-nr c-list)))

(defun find-context-1 (context c-nr c-list)
  (cond ((null c-list)
         context)
        (t
         (setq context (car c-list))
         (cond ((= c-nr (context-nr context))
                context)
               (t
                (setq c-list (cdr c-list))
                (find-context-1 context c-nr c-list))))))

(defun live-contexts ()
  (let ((index 0)
        (limit (length *contexts*)))
    (live-contexts-1 index limit)))

(defun live-contexts-1 (index limit)
  (cond ((not (< index limit))
         nil)
        (t
         (setq index (1+ index))
         (let ((rest-live-contexts
                (live-contexts-1 index limit)))
           (if (aref *contexts* index)
               (cons (aref *contexts* index)
                     rest-live-contexts)
               rest-live-contexts)))))

(defun context-method (context)
  (block-method (block-id (context-code context))))


;;; A block identifier abstraction
;;; a block id is (block blksymbol)

(defun make-blkid ()
  (gensym "BLOCK"))

(defun blkid-get-id (blkid)
  (cadr blkid))

(defun is-blkid (id)
  (equal (car id) 'block))

(defun block-method (blkid)
  (let ((method nil)
        (methods *methods*))
    (block-method-1 method methods blkid)))

(defun block-method-1 (method methods blkid)
  (cond ((null methods)
         nil)
        (t
         (setq method (car methods))
         (setq methods (cdr methods))
         (if (eq (caddr method) blkid)
             method
             (block-method-1 method methods blkid)))))

;;; returns the code
(defun method-lookup (selector class-name)
  (let ((method (method-lookup1 selector class-name)))
    (if (null method)
        (progn
          (format *standard-output*
                  "~&Message ~S not implemented for class ~S"
                  selector class-name)
          nil)
        method)))

(defun method-lookup1 (selector class-name)
  (let* ((class (get-class class-name)))
    (if class
        (let* ((supers (class-supers class))
               (methods (class-methods class))
               (method (assoc selector methods)))
          (if method
              (get-block (caddr method))
              (if (or (not (listp supers))
                      (eq class-name 'object)
                      (eq class-name nil))
                  nil
                  (method-lookup1 selector (car supers))))))))

(defun  is-id  (ref) 

(and  (listp  ref) 

(eq  (car  ref)  'id))) 

(defun  is-did  (ref) 

(and  (listp  ref) 

(eq  (car  ref)  'did) ) ) 

(defun  is-co  (ref) 

(and  (listp  ref) 

(eq  (car  ref)  'co) ) ) 

(defun  is-block  (ref) 

(and  (listp  ref) 

(eq  (car  ref)  'block))) 

(defun  ref-id  (ref) 

(cadr  ref) ) 

(defun cst-error (string &rest args)
  (apply #'format *standard-output* string args)
  nil)

(defun cst-display-list (alist)
  (format *standard-output* "~&~3D: " *step-nr*)
  (let ((val nil))
    (cst-display-list-1 val alist)))

(defun cst-display-list-1 (val alist)
  (cond ((null alist)
         nil)
        (t
         (setq val (car alist))
         (setq alist (cdr alist))
         (cst-display-1 val)
         (cst-display-list-1 val alist))))

(defun cst-display (value)
  (format *standard-output* "~&~3D: " *step-nr*)
  (cst-display-1 value))

(defun cst-display-1 (value)
  (cond ((listp value)
         (let ((type (car value))
               (index (cadr value)))
           (case type
             (id
              (format *standard-output* " ~S" (get-object index)))
             (otherwise
              (format *standard-output* " ~S" value)))))
        ((arrayp value)
         (display-array value))
        (t
         (format *standard-output* " ~S" value))))

(defun display-array (value)
  (let ((y nil)
        (x 0)
        (limit (length value)))
    (setq y (display-array-1 x limit y value))
    (format *standard-output* " ~S" (reverse y))))

(defun display-array-1 (x limit y value)
  (cond ((not (< x limit))
         y)
        (t
         (setq y (cons (aref value x) y))
         (setq x (1+ x))
         (display-array-1 x limit y value))))

;;  statistics  functions 

(defvar *log-list* '()
  "Log of Messages")

;;; log all messages this step
(defun log-step ()
  (setq *log-list*
        (cons (list *step-nr*
                    (copy-list (queue-list *step-queue*)))
              *log-list*)))

(defvar *trace-list* '()
  "Messages we've recorded")

;;; record traced messages this step
(defun record-traced-selectors (traced)
  (let ((new-msgs
         (selectively-copy-traced traced (queue-list *step-queue*))))
    (when new-msgs
      (setq *trace-list*
            (cons (list *step-nr* new-msgs)
                  *trace-list*)))))

;;  Filter  out  the  traced  selectors 

(defun selectively-copy-traced (sel-list msglist)
  (let ((msg nil))
    (selectively-copy-traced-1 msg sel-list msglist)))

(defun selectively-copy-traced-1 (msg sel-list msglist)
  (cond ((null msglist)
         nil)
        (t
         (setq msg (car msglist))
         (setq msglist (cdr msglist))
         (let ((rest-of-result
                (selectively-copy-traced-1 msg sel-list msglist)))
           (if (member (msg-selector msg) sel-list)
               (cons msg rest-of-result)
               rest-of-result)))))

(defvar *nr-msgs-received* 0
  "Number of msgs received in the current time step")

(defvar *nr-insts-executed* 0
  "Insts executed, current time step")

(defvar *nr-icodes-executed* 0
  "Icodes, current time step")

(defvar *nr-blocks-loaded* 0
  "Number of Method Cache misses, current time step")

(defun profile-step ()
  (setq *profile-list*
        (cons (make-profile-frame
               *step-nr*
               (queue-length *step-queue*)
               *nr-msgs-received*
               *nr-insts-executed*
               *nr-icodes-executed*
               *nr-blocks-loaded*
               (avg-queue-length)
               (total-message-length))
              *profile-list*))
  (setf *nr-insts-executed* 0)
  (setf *nr-icodes-executed* 0)
  (setf *nr-blocks-loaded* 0)
  (setf *nr-msgs-received* 0))

(defun make-profile-frame (time-step msgs-new msgs-done
                           insts-exec icodes-exec
                           blocks-loaded
                           avg-q-length msgs-words)
  (list time-step msgs-new msgs-done
        insts-exec icodes-exec blocks-loaded
        avg-q-length msgs-words))

(defun record-message-queue-data ()
  (setq *message-queue-trace*
        (cons
         (cons *step-nr*
               (let ((index 0)
                     (limit *nr-nodes*)
                     (mqlen 0))
                 (record-message-queue-data-1
                  index limit mqlen)))
         *message-queue-trace*)))

(defun record-message-queue-data-1 (index limit mqlen)
  (cond ((not (< index limit))
         nil)
        (t
         (setq mqlen
               (let ((message nil)
                     (messages (queue-list
                                (node-queue (get-node index))))
                     (sum 0))
                 (record-message-queue-data-2 message messages
                                              sum)))
         (let ((rest-queue-data (record-message-queue-data-1
                                 (1+ index) limit 0)))
           (if (not (zerop mqlen))
               (cons (list index mqlen)
                     rest-queue-data)
               rest-queue-data)))))

(defun record-message-queue-data-2 (message messages sum)
  (cond ((null messages)
         sum)
        (t
         (setq message (car messages))
         (setq messages (cdr messages))
         (setq sum (+ sum (msg-length message)))
         (record-message-queue-data-2 message messages sum))))

(defun avg-queue-length ()
  (let ((tql 0))
    (setq tql (sum-queue-lengths 0 tql))
    (/ tql (array-total-size *nodes*))))

(defun sum-queue-lengths (x tql)
  (if (>= x (array-total-size *nodes*))
      tql
      (sum-queue-lengths
       (1+ x)
       (+ tql (queue-length (node-queue (get-node x)))))))

(defun total-message-length ()
  (let ((sum 0))
    (total-message-length-1
     sum
     (mapcar #'message-length (queue-list *step-queue*)))))

(defun total-message-length-1 (sum lengths)
  (cond ((null lengths)
         sum)
        (t
         (setq sum (+ sum (car lengths)))
         (setq lengths (cdr lengths))
         (total-message-length-1 sum lengths))))


(defun message-length (message)
  (if (listp (msg-args message))
      (+ 3 (length (msg-args message)))
      4))
Appendix C


The Grammar Encoding the
Cliche Library


This appendix contains the grammar that encodes our cliche library. It is an extraction of
key parts of the grammar rules, showing their graph structure and the documentation
associated with the cliches they represent. Due to space limitations, non-structural
constraints are not included.

The  syntax  of  a  grammar  rule  is  as  follows: 

(Defrule <lhs node type>
  <cliche name>
  :RHS-Node-Types
  <node label-type pairs>
  :Edge-List
  <source-sink pairs>
  :Input-Embedding
  <lhs-to-rhs mappings>
  :Output-Embedding
  <lhs-to-rhs mappings>
  :St-Thrus
  <lhs-to-lhs mappings>
  :L-R-Link <cliche relationship>
  :Doc
  (<documentation string> <documentation arguments>))

The  non-terminal  node  type  of  the  rule’s  left-hand  side  is  given  by  <lhs  node  type>. 
The  name  of  the  cliche  represented  by  this  non-terminal  type  is  given  by  <cliche  name>. 

The keywords :RHS-Node-Types and :Edge-List specify the right-hand side flow graph.
:RHS-Node-Types describes the right-hand side nodes. The <node label-type pairs> is a
list of pairs of the form (<node-label> . <node-type>), each of which specifies the label
of a right-hand side node and its type. :Edge-List indicates which ports are connected
by a directed edge. The <source-sink pairs> is a list of pairs of the form
(<source port specification> . <sink port specification>), where each port specification
is of the form (<node label> <numeric port identifier>).
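As a rough illustration of these two fields, the sketch below writes out the node types and edge list of the DEQUEUE-AND-PROCESS-GENERATION rule as plain Lisp data and checks that every edge refers to a declared node label. The helper names (port-node-label, port-number, undeclared-edge-labels) are hypothetical and are not part of the recognizer; only the data layout follows the syntax described above.

```lisp
;; Illustrative sketch only: the helper names below are assumptions,
;; not functions of the recognition system.

;; <node label-type pairs>: an association list of (<node-label> . <node-type>).
(defparameter *example-node-types*
  '((dq-event . pq-extract)
    (process-the-event . process-event)))

;; <source-sink pairs>: each pair is (<source port spec> . <sink port spec>),
;; and each port spec is (<node label> <numeric port identifier>).
(defparameter *example-edge-list*
  '(((dq-event 3) . (process-the-event 2))
    ((dq-event 2) . (process-the-event 1))))

(defun port-node-label (port-spec)
  "Return the node label of PORT-SPEC."
  (first port-spec))

(defun port-number (port-spec)
  "Return the numeric port identifier of PORT-SPEC."
  (second port-spec))

(defun undeclared-edge-labels (node-types edge-list)
  "Return the node labels that EDGE-LIST mentions but NODE-TYPES does not declare."
  (let ((undeclared '()))
    (dolist (edge edge-list (nreverse undeclared))
      (dolist (port (list (car edge) (cdr edge)))
        (let ((label (port-node-label port)))
          (unless (assoc label node-types)
            (pushnew label undeclared)))))))
```

Calling (undeclared-edge-labels *example-node-types* *example-edge-list*) returns NIL for this rule, since both edges connect declared nodes.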

The keywords :Input-Embedding, :Output-Embedding, and :St-Thrus specify the embedding
relation of the rule. The <lhs-to-rhs mappings> in the input and output embeddings
is a list of mappings of the form (<lhs port specification> <rhs port specification>
[<data part or overlay name>]). The pair of port specifications describes the
correspondence between a port on the left-hand side node and a port on a right-hand side node.
The <data part or overlay name> is optional. It can name either a part of a cliched
aggregate data structure or a data overlay. For example, in the rule for CIS-Extract, there
is the lhs-to-rhs mapping ((CIS-Extract 1) (Access-Base 1) Base). This maps the Base
part of the CIS aggregate data structure represented by port 1 of the left-hand side node
CIS-Extract to port 1 of the right-hand side node Access-Base. An example of a
lhs-to-rhs mapping that includes a data overlay name is found in a rule for FIFO-Dequeue:
((FIFO-Dequeue 1) (Extract-CIS-First 1) Circular-Indexed-Sequence>FIFO). This maps
the first ports of the left-hand side and right-hand side nodes to each other and it specifies
that they are related by a data overlay that views a Circular-Indexed-Sequence as a FIFO
queue. Similarly, the <lhs-to-lhs mappings> following the :St-Thrus keyword is a list of
mappings of the form (<lhs input port specification> <lhs output port specification>
[<data part or overlay name>]). Such a mapping specifies that the two left-hand side ports
correspond, i.e., the rule contains a st-thru.
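The mapping format can be sketched as ordinary list access. The example below uses the input embedding of the UPDATE-NODE-TIME rule, in which the first mapping carries the optional data part name TIME and the second carries none. The accessor names are hypothetical and chosen only for this illustration.

```lisp
;; Illustrative sketch only: accessor names are assumptions.

;; Each mapping is (<lhs port spec> <rhs port spec> [<data part or overlay name>]).
(defparameter *example-input-embedding*
  '(((update-node-time 1) (find-max 1) time)
    ((update-node-time 2) (find-max 2))))

(defun mapping-lhs-port (mapping)
  "Return the left-hand side port specification of MAPPING."
  (first mapping))

(defun mapping-rhs-port (mapping)
  "Return the right-hand side port specification of MAPPING."
  (second mapping))

(defun mapping-annotation (mapping)
  "Return the optional data part or overlay name, or NIL when it is absent."
  (third mapping))

(defun mappings-for-lhs-port (port-number mappings)
  "Collect the mappings whose left-hand side port number equals PORT-NUMBER."
  (remove-if-not (lambda (mapping)
                   (= (second (mapping-lhs-port mapping)) port-number))
                 mappings))
```

Because the third element is optional, mapping-annotation returns NIL for the second mapping. A st-thru mapping has the same shape, except that both port specifications name the left-hand side node.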

The <cliche relationship> given with the :L-R-Link keyword describes how the cliched
operation represented by the left-hand side node is related to the cliched operation(s)
represented by the right-hand side node(s). This information is used in annotating the links
of a design tree and in generating documentation.

The explanation fragment associated with a cliche is given in the :Doc keyword, whose
value consists of a <documentation string> with slots that are filled in by the
<documentation arguments>. The arguments are in the form of expressions that are evaluated
in the context in which the right-hand side of the rule is reduced to the left-hand side
during parsing.
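The slot-filling step can be pictured with the standard Common Lisp format function, on the assumption (suggested by the ~A and ~% directives in the rules, but not stated in the text) that documentation strings are format control strings. In this sketch the argument values are literal strings standing in for the results of evaluating the documentation arguments. The function name fill-documentation is hypothetical.

```lisp
;; Illustrative sketch only: FILL-DOCUMENTATION is an assumed name, and the
;; argument values are supplied directly rather than computed during parsing.
(defun fill-documentation (doc-string argument-values)
  "Fill the slots of DOC-STRING with ARGUMENT-VALUES, in order."
  (apply #'format nil doc-string argument-values))

;; Example: two slots, filled with stand-in port names.
(fill-documentation
 "dequeues the event queue ~A and processes the event dequeued, using the address-map ~A."
 '("EVQ" "NODES"))
;; => "dequeues the event queue EVQ and processes the event dequeued, using the address-map NODES."
```

In the grammar itself each argument is an expression such as (INPUT-PORT-NAME> (DOC-BP> (PROCESS-EVENT 1))), so the values substituted into the slots depend on the particular reduction being documented.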

If  a  rule  has  been  depicted  in  a  figure  in  the  document,  then  the  figure’s  number  is  given 
in  a  comment  preceding  the  rule.  (There  is  an  index  of  the  list  of  figures  following  this 
appendix.) 

The grammar rules are followed by an alphabetical list of the non-terminal node types
and the types of their ports. For example, a node of type ABC, having three ports of type
Integer, Symbol, and Queue, respectively, is listed as: (ABC 1:Integer 2:Symbol 3:Queue).
The number preceding each node type specifies the page on which the rules for the node
type begin.
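The listing format for node types is simple enough to generate mechanically. The following sketch renders one entry from a node type name and its port types. The function name format-node-type-entry is hypothetical, and the names are passed as strings purely for illustration.

```lisp
;; Illustrative sketch only: FORMAT-NODE-TYPE-ENTRY is an assumed name.
(defun format-node-type-entry (node-type port-types)
  "Render NODE-TYPE and PORT-TYPES as an entry such as (ABC 1:Integer 2:Symbol 3:Queue)."
  (with-output-to-string (out)
    (format out "(~A" node-type)
    ;; Ports are numbered from 1, in the order given.
    (loop for port-type in port-types
          for port-number from 1
          do (format out " ~D:~A" port-number port-type))
    (format out ")")))

(format-node-type-entry "ABC" '("Integer" "Symbol" "Queue"))
;; => "(ABC 1:Integer 2:Symbol 3:Queue)"
```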
(Defrule SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM
  "Sequential Simulation of Parallel Message-Passing System"
  :RHS-Node-Types
  ((SIMULATE-ASYNCHRONOUSLY . EVENT-DRIVEN-SIMULATION))
  :Input-Embedding
  (((SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM 1)
    (SIMULATE-ASYNCHRONOUSLY 3))
   ((SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM 2)
    (SIMULATE-ASYNCHRONOUSLY 1)))
  :Output-Embedding
  (((SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM 3)
    (SIMULATE-ASYNCHRONOUSLY 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("sequentially simulates a parallel message-passing system."))

(Defrule SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM
  "Sequential Simulation of Parallel Message-Passing System"
  :RHS-Node-Types
  ((SIMULATE-SYNCHRONOUSLY . SYNCHRONOUS-SIMULATION))
  :Input-Embedding
  (((SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM 1)
    (SIMULATE-SYNCHRONOUSLY 1))
   ((SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM 2)
    (SIMULATE-SYNCHRONOUSLY 2)))
  :Output-Embedding
  (((SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM 3)
    (SIMULATE-SYNCHRONOUSLY 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("sequentially simulates a parallel message-passing system."))
;;; Figure 4-21.

(Defrule EVENT-DRIVEN-SIMULATION
  "Event-Driven Simulation"
  :RHS-Node-Types
  ((INSERT-INITIAL-EVENT . PQ-INSERT)
   (GENERATE-EVQ+NODES . GENERATE-EVENT-QUEUES-AND-NODES)
   (ED-FINISHED? . CO-EARLIEST-EDS-FINISHED))
  :Edge-List
  (((INSERT-INITIAL-EVENT 3) . (GENERATE-EVQ+NODES 1))
   ((GENERATE-EVQ+NODES 4) . (ED-FINISHED? 2))
   ((GENERATE-EVQ+NODES 3) . (ED-FINISHED? 1)))
  :Input-Embedding
  (((EVENT-DRIVEN-SIMULATION 1) (INSERT-INITIAL-EVENT 1))
   ((EVENT-DRIVEN-SIMULATION 2) (INSERT-INITIAL-EVENT 2))
   ((EVENT-DRIVEN-SIMULATION 3) (GENERATE-EVQ+NODES 2)))
  :Output-Embedding
  (((EVENT-DRIVEN-SIMULATION 4) (ED-FINISHED? 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("asynchronously simulates a collection of processing nodes ~
handling messages, using an event-driven algorithm. An ~
event queue ~A of events is maintained. To start, an ~
initial event ~A is inserted in the event-queue. On each ~
step, an event is pulled off and processed, which may ~
create new events to be added to the event-queue. ~
The asynchronous nodes (which represent processing nodes) ~
are collected in an address-map, called ~A."
   (INPUT-PORT-NAME> (DOC-BP> (EVENT-DRIVEN-SIMULATION 2)))
   (INPUT-PORT-NAME> (DOC-BP> (EVENT-DRIVEN-SIMULATION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (EVENT-DRIVEN-SIMULATION 3)))))


;;; Figure 4-21.

(Defrule GENERATE-EVENT-QUEUES-AND-NODES
  "Generate Event Queues and Nodes"
  :RHS-Node-Types
  ((EVENT+NODE-GEN-F . DEQUEUE-AND-PROCESS-GENERATION))
  :Input-Embedding
  (((GENERATE-EVENT-QUEUES-AND-NODES 1) (EVENT+NODE-GEN-F 1))
   ((GENERATE-EVENT-QUEUES-AND-NODES 2) (EVENT+NODE-GEN-F 2)))
  :Output-Embedding
  (((GENERATE-EVENT-QUEUES-AND-NODES 3) (EVENT+NODE-GEN-F 3))
   ((GENERATE-EVENT-QUEUES-AND-NODES 4) (EVENT+NODE-GEN-F 4)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("generates event queues and address-maps by repeatedly ~
dequeuing the current event queue and processing the event ~
dequeued. Processing an event causes new events to be ~
added to the event queue and a new address-map to be ~
created. The initial event queue is ~A and the initial ~
address-map is ~A.~%~
The outputs of this operation are 2 series:~%~
one is the series of event queues and the other is the ~
series of address-maps created."
   (INPUT-PORT-NAME>
    (DOC-BP> (GENERATE-EVENT-QUEUES-AND-NODES 1)))
   (INPUT-PORT-NAME>
    (DOC-BP> (GENERATE-EVENT-QUEUES-AND-NODES 2)))))


;;; Figure 4-21.

(Defrule DEQUEUE-AND-PROCESS-GENERATION
  "Dequeue and Process Generation"
  :RHS-Node-Types
  ((DQ-EVENT . PQ-EXTRACT)
   (PROCESS-THE-EVENT . PROCESS-EVENT))
  :Edge-List
  (((DQ-EVENT 3) . (PROCESS-THE-EVENT 2))
   ((DQ-EVENT 2) . (PROCESS-THE-EVENT 1)))
  :Input-Embedding
  (((DEQUEUE-AND-PROCESS-GENERATION 1) (DQ-EVENT 1))
   ((DEQUEUE-AND-PROCESS-GENERATION 2) (PROCESS-THE-EVENT 3)))
  :St-Thrus
  (((DEQUEUE-AND-PROCESS-GENERATION 2) (DEQUEUE-AND-PROCESS-GENERATION 4))
   ((DEQUEUE-AND-PROCESS-GENERATION 1) (DEQUEUE-AND-PROCESS-GENERATION 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("dequeues the event queue ~A and processes the event dequeued,
using the address-map ~A."
   (INPUT-PORT-NAME> (DOC-BP> (DEQUEUE-AND-PROCESS-GENERATION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (DEQUEUE-AND-PROCESS-GENERATION 2)))))

;;; Figure 4-22.

(Defrule CO-EARLIEST-EDS-FINISHED
  "Co-Earliest Event-Driven Simulation Finished"
  :RHS-Node-Types
  ((EDS-FINISHED? . CO-ITERATIVE-EDS-FINISHED))
  :Input-Embedding
  (((CO-EARLIEST-EDS-FINISHED 1) (EDS-FINISHED? 1))
   ((CO-EARLIEST-EDS-FINISHED 2) (EDS-FINISHED? 2)))
  :Output-Embedding
  (((CO-EARLIEST-EDS-FINISHED 3) (EDS-FINISHED? 3)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("takes a sequence of event-queues and a sequence of address-maps and ~%~
returns the address-map in the sequence of address-maps that ~%~
corresponds to the first empty event-queue in the sequence of ~%~
event-queues."))

;;; Figure 4-22.

(Defrule CO-ITERATIVE-EDS-FINISHED
  "Co-Iterative Event-Driven Simulation Finished"
  :RHS-Node-Types
  ((TERMINATE-EDS? . PQ-EMPTY))
  :Input-Embedding
  (((CO-ITERATIVE-EDS-FINISHED 1) (TERMINATE-EDS? 1)))
  :St-Thrus
  (((CO-ITERATIVE-EDS-FINISHED 2) (CO-ITERATIVE-EDS-FINISHED 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("terminates the simulation when the current event-queue (~A)~%~
is empty, returning the current value of the address-map (~A).~%~
The event-queue is implemented as a Priority Queue."
   (INPUT-PORT-NAME> (DOC-BP> (CO-ITERATIVE-EDS-FINISHED 1)))
   (INPUT-PORT-NAME> (DOC-BP> (CO-ITERATIVE-EDS-FINISHED 2)))))

;;; Figure 4-24.

(Defrule PROCESS-EVENT
  "Process Event"
  :RHS-Node-Types
  ((GET-DEST . LOOKUP-DESTINATION)
   (TIME-UPDATE . UPDATE-NODE-TIME)
   (RECORD-DEST . RECORD-AT-DESTINATION)
   (PROCESS-THE-MSG . HANDLE-MESSAGE))
  :Edge-List
  (((GET-DEST 3) . (TIME-UPDATE 1))
   ((TIME-UPDATE 3) . (RECORD-DEST 1))
   ((RECORD-DEST 4) . (PROCESS-THE-MSG 2)))
  :Input-Embedding
  (((PROCESS-EVENT 1) (PROCESS-THE-MSG 1)
    OBJECT)
   ((PROCESS-EVENT 1) (RECORD-DEST 2)
    OBJECT)
   ((PROCESS-EVENT 1) (GET-DEST 2)
    OBJECT)
   ((PROCESS-EVENT 1) (TIME-UPDATE 2)
    TIME)
   ((PROCESS-EVENT 2) (PROCESS-THE-MSG 3))
   ((PROCESS-EVENT 3) (RECORD-DEST 3))
   ((PROCESS-EVENT 3) (GET-DEST 1)))
  :Output-Embedding
  (((PROCESS-EVENT 4) (PROCESS-THE-MSG 5))
   ((PROCESS-EVENT 5) (PROCESS-THE-MSG 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("processes the event ~A whose object ~A is a Message,~%~
using the asynchronous node that is the destination of the message.~%~
First the time of this node is updated with respect to the~%~
time of the event's object ~A. Then the node~%~
handles the message, creating a new address-map and event ~
queue."
   (INPUT-PORT-NAME> (DOC-BP> (PROCESS-EVENT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (PROCESS-EVENT 1) OBJECT))
   (INPUT-PORT-NAME> (DOC-BP> (PROCESS-EVENT 1) TIME))))

;;; Figure 4-26.

(Defrule UPDATE-NODE-TIME
  "Update Node Time"
  :RHS-Node-Types
  ((FIND-MAX . MAX))
  :Input-Embedding
  (((UPDATE-NODE-TIME 1) (FIND-MAX 1)
    TIME)
   ((UPDATE-NODE-TIME 2) (FIND-MAX 2)))
  :Output-Embedding
  (((UPDATE-NODE-TIME 3) (FIND-MAX 3)
    TIME))
  :St-Thrus
  (((UPDATE-NODE-TIME 1) (UPDATE-NODE-TIME 3)
    MEMORY))
  :L-R-Link COMPOSITION
  :Doc
  ("updates the time of the asynchronous node ~A~%~
to be the maximum of its current time ~A~%~
and the input time ~A."
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE-NODE-TIME 1)))
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE-NODE-TIME 1) TIME))
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE-NODE-TIME 2)))))

(Defrule LOCAL-BUFFER-NQ
  "Local Buffer Enqueue"
  :RHS-Node-Types
  ((BUFFER-MSG-LOCALLY . FIFO-ENQUEUE))
  :Input-Embedding
  (((LOCAL-BUFFER-NQ 1) (BUFFER-MSG-LOCALLY 1))
   ((LOCAL-BUFFER-NQ 2) (BUFFER-MSG-LOCALLY 2)
    LOCAL-BUFFER))
  :Output-Embedding
  (((LOCAL-BUFFER-NQ 3) (BUFFER-MSG-LOCALLY 3)
    LOCAL-BUFFER))
  :St-Thrus
  (((LOCAL-BUFFER-NQ 2) (LOCAL-BUFFER-NQ 3)
    MEMORY))
  :L-R-Link COMPOSITION
  :Doc
  ("enqueues the message ~A on the local buffer of the ~
synchronous node ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOCAL-BUFFER-NQ 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOCAL-BUFFER-NQ 2)))))


(Defrule DELIVER-MESSAGE
  "Deliver Message"
  :RHS-Node-Types
  ((MAKE-DELIVERY . LOOKUP-NODE+NQ+UPDATE))
  :Input-Embedding
  (((DELIVER-MESSAGE 1) (MAKE-DELIVERY 1))
   ((DELIVER-MESSAGE 2) (MAKE-DELIVERY 2)))
  :St-Thrus
  (((DELIVER-MESSAGE 2) (DELIVER-MESSAGE 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("iteratively delivers the message ~A to the node addressed by the~%~
message's Destination-Address part."
   (INPUT-PORT-NAME> (DOC-BP> (DELIVER-MESSAGE 1)))))

(Defrule DELIVER-MESSAGE-ACCUMULATE
  "Deliver Message Accumulate"
  :RHS-Node-Types
  ((THE-DELIVERY . DELIVER-MESSAGE))
  :Input-Embedding
  (((DELIVER-MESSAGE-ACCUMULATE 1) (THE-DELIVERY 1))
   ((DELIVER-MESSAGE-ACCUMULATE 2) (THE-DELIVERY 2)))
  :Output-Embedding
  (((DELIVER-MESSAGE-ACCUMULATE 3) (THE-DELIVERY 3)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("accumulates the new nodes created by delivering the message in the~%~
series from ~A into a new address-map ~A."
   (INPUT-PORT-NAME> (DOC-BP> (DELIVER-MESSAGE-ACCUMULATE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (DELIVER-MESSAGE-ACCUMULATE 2)))))

(Defrule ENUMERATE-AND-DELIVER-MESSAGES
  "Enumerate and Deliver Messages"
  :RHS-Node-Types
  ((ENUMERATE-MESSAGES . DESTRUCTIVE-QUEUE-ENUMERATION)
   (DELIVER-THE-MESSAGES . DELIVER-MESSAGE-ACCUMULATE))
  :Edge-List
  (((ENUMERATE-MESSAGES 2) . (DELIVER-THE-MESSAGES 1)))
  :Input-Embedding
  (((ENUMERATE-AND-DELIVER-MESSAGES 1) (ENUMERATE-MESSAGES 1))
   ((ENUMERATE-AND-DELIVER-MESSAGES 2) (DELIVER-THE-MESSAGES 2)))
  :Output-Embedding
  (((ENUMERATE-AND-DELIVER-MESSAGES 3) (DELIVER-THE-MESSAGES 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates the messages in the global message buffer ~A ~
and delivers each one to the nodes addressed by the message's ~
Destination Address part. The new nodes created during delivery
are accumulated into a global address-map, implemented as a ~
sequence, whose initial value is ~A.~%~
The new (accumulated) global address-map is returned."
   (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-AND-DELIVER-MESSAGES 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-AND-DELIVER-MESSAGES 2)))))

;;; Figure 5-5.

(Defrule LOCAL-BUFFER-DQ
  "Local Buffer Dequeue"
  :RHS-Node-Types
  ((EXTRACT-MSG . FIFO-DEQUEUE))
  :Input-Embedding
  (((LOCAL-BUFFER-DQ 1) (EXTRACT-MSG 1)
    LOCAL-BUFFER))
  :Output-Embedding
  (((LOCAL-BUFFER-DQ 2) (EXTRACT-MSG 2))
   ((LOCAL-BUFFER-DQ 3) (EXTRACT-MSG 3)
    LOCAL-BUFFER))
  :St-Thrus
  (((LOCAL-BUFFER-DQ 1) (LOCAL-BUFFER-DQ 3)
    MEMORY))
  :L-R-Link COMPOSITION
  :Doc
  ("dequeues the first message (if any) from the local buffer ~
of the Synch-Node ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOCAL-BUFFER-DQ 1)))))

(Defrule LOOKUP-NODE+NQ+UPDATE
  "Lookup Node, Enqueue Message, and Update Node Map"
  :RHS-Node-Types
  ((LOOKUP-DEST-NODE . LOOKUP-DESTINATION)
   (NQ-MSG . LOCAL-BUFFER-NQ)
   (UPDATE-MAP . RECORD-AT-DESTINATION))
  :Edge-List
  (((LOOKUP-DEST-NODE 3) . (NQ-MSG 2))
   ((NQ-MSG 3) . (UPDATE-MAP 1)))
  :Input-Embedding
  (((LOOKUP-NODE+NQ+UPDATE 1) (UPDATE-MAP 2))
   ((LOOKUP-NODE+NQ+UPDATE 1) (NQ-MSG 1))
   ((LOOKUP-NODE+NQ+UPDATE 1) (LOOKUP-DEST-NODE 2))
   ((LOOKUP-NODE+NQ+UPDATE 2) (UPDATE-MAP 3))
   ((LOOKUP-NODE+NQ+UPDATE 2) (LOOKUP-DEST-NODE 1)))
  :Output-Embedding
  (((LOOKUP-NODE+NQ+UPDATE 3) (UPDATE-MAP 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("looks up the synchronous node at the address in the ~
Destination Address part of message ~A in the global address-map ~
~A. It then creates a new node w/ the message on the front of the ~
new node's local buffer. The new node is added to the global ~
address-map."
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-NODE+NQ+UPDATE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-NODE+NQ+UPDATE 2)))))

(Defrule DELIVER-MESSAGES
  "Deliver Messages"
  :RHS-Node-Types
  ((ENUMERATE-AND-DELIVER . ENUMERATE-AND-DELIVER-MESSAGES))
  :Input-Embedding
  (((DELIVER-MESSAGES 1) (ENUMERATE-AND-DELIVER 1))
   ((DELIVER-MESSAGES 2) (ENUMERATE-AND-DELIVER 2)))
  :Output-Embedding
  (((DELIVER-MESSAGES 3) (ENUMERATE-AND-DELIVER 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("delivers the messages in the global message buffer ~A, creating ~%~
new nodes, which are accumulated into a global address-map ~%~
whose initial value is ~A."
   (INPUT-PORT-NAME> (DOC-BP> (DELIVER-MESSAGES 1)))
   (INPUT-PORT-NAME> (DOC-BP> (DELIVER-MESSAGES 2)))))

(Defrule LOCAL-BUFFER-EMPTY?
  "Local Buffer Empty Test"
  :RHS-Node-Types
  ((CHECK-BUFFER . FIFO-EMPTY?))
  :Input-Embedding
  (((LOCAL-BUFFER-EMPTY? 1) (CHECK-BUFFER 1) LOCAL-BUFFER))
  :L-R-Link COMPOSITION
  :Doc
  ("tests whether the local buffer of synchronous node ~A is empty."
   (INPUT-PORT-NAME> (DOC-BP> (LOCAL-BUFFER-EMPTY? 1)))))

(Defrule LOCAL-BUFFER-NONEMPTY?
  "Local Buffer Nonempty Test"
  :RHS-Node-Types
  ((CHECK-BUFFER . FIFO-EMPTY?))
  :Input-Embedding
  (((LOCAL-BUFFER-NONEMPTY? 1) (CHECK-BUFFER 1) LOCAL-BUFFER))
  :L-R-Link COMPOSITION
  :Doc
  ("tests whether the local buffer of synchronous node ~A is ~
nonempty."
   (INPUT-PORT-NAME> (DOC-BP> (LOCAL-BUFFER-NONEMPTY? 1)))))

(Defrule LOCAL-BUFFERS-ALWAYS-EMPTY?
  "Local Buffer Always Empty Test"
  :RHS-Node-Types
  ((CONTINUOUS-CHECK . LOCAL-BUFFER-NONEMPTY?))
  :Input-Embedding
  (((LOCAL-BUFFERS-ALWAYS-EMPTY? 1) (CONTINUOUS-CHECK 1)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("continually checks that each node in the input series of ~
nodes ~A has an empty local buffer."
   (INPUT-PORT-NAME> (DOC-BP> (LOCAL-BUFFERS-ALWAYS-EMPTY? 1)))))

(Defrule ENUM-NODES+CHECK-BUFFERS
  "Enumerate Nodes and Check Buffers"
  :RHS-Node-Types
  ((ENUMERATE-NODES . SEQUENCE-ENUMERATION)
   (BUFFER-ALWAYS-EMPTY . LOCAL-BUFFERS-ALWAYS-EMPTY?))
  :Edge-List
  (((ENUMERATE-NODES 2) . (BUFFER-ALWAYS-EMPTY 1)))
  :Input-Embedding
  (((ENUM-NODES+CHECK-BUFFERS 1) (ENUMERATE-NODES 1)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates the sequence of nodes ~A and checks that each ~
node has an empty local buffer."
   (INPUT-PORT-NAME> (DOC-BP> (ENUM-NODES+CHECK-BUFFERS 1)))))

(Defrule LOCAL-BUFFERS-EMPTY?
  "Local Buffers Empty"
  :RHS-Node-Types
  ((CHECK-ALL-NODE-BUFFERS . ENUM-NODES+CHECK-BUFFERS))
  :Input-Embedding
  (((LOCAL-BUFFERS-EMPTY? 1) (CHECK-ALL-NODE-BUFFERS 1)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("checks that all nodes in ~A have an empty local buffer."
   (INPUT-PORT-NAME> (DOC-BP> (LOCAL-BUFFERS-EMPTY? 1)))))

(Defrule GLOBAL-AND-LOCAL-BUFFERS-EMPTY?
  "Global and Local Buffers Empty Test"
  :RHS-Node-Types
  ((CHECK-LOCAL-NODE-BUFFERS . LOCAL-BUFFERS-EMPTY?)
   (CHECK-GLOBAL-BUFFER . QUEUE-EMPTY?))
  :Input-Embedding
  (((GLOBAL-AND-LOCAL-BUFFERS-EMPTY? 1) (CHECK-LOCAL-NODE-BUFFERS 1))
   ((GLOBAL-AND-LOCAL-BUFFERS-EMPTY? 2) (CHECK-GLOBAL-BUFFER 1)))
  :L-R-Link COMPOSITION
  :Doc
  ("tests whether the local buffers of the synchronous nodes in ~A ~
are all empty and the global message buffer ~A is also empty."
   (INPUT-PORT-NAME> (DOC-BP> (GLOBAL-AND-LOCAL-BUFFERS-EMPTY? 1)))
   (INPUT-PORT-NAME> (DOC-BP> (GLOBAL-AND-LOCAL-BUFFERS-EMPTY? 2)))))

(Defrule SYNCHRONOUS-SIMULATION-FINISHED?
  "Synchronous Simulation Finished?"
  :RHS-Node-Types
  ((CHECK-ALL-BUFFERS . GLOBAL-AND-LOCAL-BUFFERS-EMPTY?))
  :Input-Embedding
  (((SYNCHRONOUS-SIMULATION-FINISHED? 1) (CHECK-ALL-BUFFERS 1))
   ((SYNCHRONOUS-SIMULATION-FINISHED? 2) (CHECK-ALL-BUFFERS 2)))
  :St-Thrus
  (((SYNCHRONOUS-SIMULATION-FINISHED? 1)
    (SYNCHRONOUS-SIMULATION-FINISHED? 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("tests whether a synchronous simulation is finished by ~
testing whether the global buffer and all of the nodes' ~
local buffers are empty."))

(Defrule EXTRACT-AND-HANDLE-FIRST-MESSAGE
  "Extract and Handle First Message"
  :RHS-Node-Types
  ((HAS-WORK? . LOCAL-BUFFER-NONEMPTY?)
   (EXTRACT-FIRST-MSG . LOCAL-BUFFER-DQ)
   (RECORD-WORKING-NODE . NEW-TERM)
   (HANDLE-THE-MESSAGE . HANDLE-MESSAGE))
  :Edge-List
  (((EXTRACT-FIRST-MSG 2) . (HANDLE-THE-MESSAGE 1))
   ((EXTRACT-FIRST-MSG 3) . (RECORD-WORKING-NODE 1))
   ((RECORD-WORKING-NODE 4) . (HANDLE-THE-MESSAGE 2)))
  :Input-Embedding
  (((EXTRACT-AND-HANDLE-FIRST-MESSAGE 1) (EXTRACT-FIRST-MSG 1))
   ((EXTRACT-AND-HANDLE-FIRST-MESSAGE 1) (HAS-WORK? 1))
   ((EXTRACT-AND-HANDLE-FIRST-MESSAGE 2) (RECORD-WORKING-NODE 2))
   ((EXTRACT-AND-HANDLE-FIRST-MESSAGE 3) (RECORD-WORKING-NODE 3))
   ((EXTRACT-AND-HANDLE-FIRST-MESSAGE 4) (HANDLE-THE-MESSAGE 3)))
  :Output-Embedding
  (((EXTRACT-AND-HANDLE-FIRST-MESSAGE 5) (HANDLE-THE-MESSAGE 4))
   ((EXTRACT-AND-HANDLE-FIRST-MESSAGE 6) (HANDLE-THE-MESSAGE 5)))
  :St-Thrus
  (((EXTRACT-AND-HANDLE-FIRST-MESSAGE 4)
    (EXTRACT-AND-HANDLE-FIRST-MESSAGE 6))
   ((EXTRACT-AND-HANDLE-FIRST-MESSAGE 3)
    (EXTRACT-AND-HANDLE-FIRST-MESSAGE 5)))
  :L-R-Link COMPOSITION
  :Doc
  ("extracts the first message from the local buffer of synchronous node~%~
~A if the node has work, i.e., messages queued up. The message is~%~
then processed, which may generate new messages. The new messages ~%~
are collected on the message queue."
   (INPUT-PORT-NAME> (DOC-BP> (EXTRACT-AND-HANDLE-FIRST-MESSAGE 1)))))

(Defrule DO-WORK-ACCUMULATION
  "Do Work Accumulation"
  :RHS-Node-Types
  ((EXTRACT-AND-HANDLE . EXTRACT-AND-HANDLE-FIRST-MESSAGE))
  :Input-Embedding
  (((DO-WORK-ACCUMULATION 1) (EXTRACT-AND-HANDLE 1))
   ((DO-WORK-ACCUMULATION 2) (EXTRACT-AND-HANDLE 2))
   ((DO-WORK-ACCUMULATION 3) (EXTRACT-AND-HANDLE 3))
   ((DO-WORK-ACCUMULATION 4) (EXTRACT-AND-HANDLE 4)))
  :St-Thrus
  (((DO-WORK-ACCUMULATION 4) (DO-WORK-ACCUMULATION 6))
   ((DO-WORK-ACCUMULATION 3) (DO-WORK-ACCUMULATION 5)))
  :L-R-Link COMPOSITION
  :Doc
  ("iteratively receives a synchronous node ~A, extracts and handles its~
first message if it has one in its local buffer, and accumulates the~
new messages that this generates in a global message buffer ~A. This~
also creates new nodes, which are accumulated in an address-map, whose~
initial value is ~A."
   (INPUT-PORT-NAME> (DOC-BP> (DO-WORK-ACCUMULATION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (DO-WORK-ACCUMULATION 4)))
   (INPUT-PORT-NAME> (DOC-BP> (DO-WORK-ACCUMULATION 3)))))


(Defrule DO-WORK-ACCUMULATE
  "Do Work Accumulate"
  :RHS-Node-Types
  ((DW-ACCUMULATION . DO-WORK-ACCUMULATION))
  :Input-Embedding
  (((DO-WORK-ACCUMULATE 1) (DW-ACCUMULATION 1))
   ((DO-WORK-ACCUMULATE 2) (DW-ACCUMULATION 2))
   ((DO-WORK-ACCUMULATE 3) (DW-ACCUMULATION 3))
   ((DO-WORK-ACCUMULATE 4) (DW-ACCUMULATION 4)))
  :Output-Embedding
  (((DO-WORK-ACCUMULATE 5) (DW-ACCUMULATION 5))
   ((DO-WORK-ACCUMULATE 6) (DW-ACCUMULATION 6)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("takes a series of nodes and simulates them taking one step (i.e.,~
handling one message a piece from their local buffers). It ~
accumulates the new nodes that this creates in an address-map, which ~
is given as output. It also accumulates all new messages generated ~
during the node stepping in a global message buffer, which it also ~
produces as output. The initial value of the address-map is ~A and ~
of the global message buffer is ~A."
   (INPUT-PORT-NAME> (DOC-BP> (DO-WORK-ACCUMULATION 3)))
   (INPUT-PORT-NAME> (DOC-BP> (DO-WORK-ACCUMULATION 4)))))


(Defrule POLL-NODES-AND-DO-WORK
  "Poll Nodes and Do Work"
  :RHS-Node-Types
  ((POLL-NODES . SEQUENCE-AND-INDEX-ENUMERATION)
   (WORK . DO-WORK-ACCUMULATE))
  :Edge-List
  (((POLL-NODES 3) . (WORK 2))
   ((POLL-NODES 2) . (WORK 1)))
  :Input-Embedding
  (((POLL-NODES-AND-DO-WORK 1) (WORK 3))
   ((POLL-NODES-AND-DO-WORK 1) (POLL-NODES 1)))
  :Output-Embedding
  (((POLL-NODES-AND-DO-WORK 2) (WORK 5))
   ((POLL-NODES-AND-DO-WORK 3) (WORK 6)))
  :L-R-Link COMPOSITION
  :Doc
  ("polls all nodes in ~A and for each node that has messages on its ~
local queue, it handles one of the messages."
   (INPUT-PORT-NAME> (DOC-BP> (POLL-NODES-AND-DO-WORK 1)))))


(Defrule ADVANCE-NODES
  "Advance Nodes"
  :RHS-Node-Types
  ((STEP-NODES . POLL-NODES-AND-DO-WORK))
  :Input-Embedding
  (((ADVANCE-NODES 1) (STEP-NODES 1)))
  :Output-Embedding
  (((ADVANCE-NODES 2) (STEP-NODES 2))
   ((ADVANCE-NODES 3) (STEP-NODES 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("steps each node in ~A that has work by processing 1 message ~
each."
   (INPUT-PORT-NAME> (DOC-BP> (ADVANCE-NODES 1)))))

(Defrule EARLIEST-SIMULATION-FINISHED
  "Earliest Simulation Finished"
  :RHS-Node-Types
  ((FINISHED-TEST . SYNCHRONOUS-SIMULATION-FINISHED?))
  :Input-Embedding
  (((EARLIEST-SIMULATION-FINISHED 1) (FINISHED-TEST 1))
   ((EARLIEST-SIMULATION-FINISHED 2) (FINISHED-TEST 2)))
  :Output-Embedding
  (((EARLIEST-SIMULATION-FINISHED 3) (FINISHED-TEST 3)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("takes two input sequences: a sequence of address-maps, ~
starting with ~A, and a sequence of global message buffers, ~
starting with ~A. It outputs the first address-map in the ~
input sequence of address-maps that satisfies the predicate ~
that all nodes in the address-map have empty local buffers ~
and the corresponding global message buffer is empty."
   (INPUT-PORT-NAME> (DOC-BP> (EARLIEST-SIMULATION-FINISHED 1)))
   (INPUT-PORT-NAME> (DOC-BP> (EARLIEST-SIMULATION-FINISHED 2)))))

(Defrule DELIVER-MESSAGES-AND-STEP-NODES
  "Generate by Message Delivery and Node Stepping"
  :RHS-Node-Types
  ((DELIVER-ALL-MSGS . DELIVER-MESSAGES)
   (STEP-ALL-NODES . ADVANCE-NODES))
  :Edge-List
  (((DELIVER-ALL-MSGS 3) . (STEP-ALL-NODES 1)))
  :Input-Embedding
  (((DELIVER-MESSAGES-AND-STEP-NODES 1) (DELIVER-ALL-MSGS 2))
   ((DELIVER-MESSAGES-AND-STEP-NODES 2) (DELIVER-ALL-MSGS 1)))
  :St-Thrus
  (((DELIVER-MESSAGES-AND-STEP-NODES 2)
    (DELIVER-MESSAGES-AND-STEP-NODES 4))
   ((DELIVER-MESSAGES-AND-STEP-NODES 1)
    (DELIVER-MESSAGES-AND-STEP-NODES 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("generates address-maps and global message buffers by ~
repeatedly delivering all messages in the global message ~
buffer ~A and advancing the nodes ~A by one step each. ~

This causes more messages to be generated and added to the ~
global message buffer and a new address-map to be created ~
on each iteration. The outputs of this operation are 2 ~
series: one is the series of address-maps created and the ~
other is the series of global message buffers."
   (INPUT-PORT-NAME>
    (DOC-BP> (DELIVER-MESSAGES-AND-STEP-NODES 2)))
   (INPUT-PORT-NAME>
    (DOC-BP> (DELIVER-MESSAGES-AND-STEP-NODES 1)))))

(Defrule GENERATE-GLOBAL-BUFFERS-AND-NODES
  "Generate Global Message Buffer and Nodes"
  :RHS-Node-Types
  ((GEN-BUFFER-AND-NODES . DELIVER-MESSAGES-AND-STEP-NODES))
  :Input-Embedding
  (((GENERATE-GLOBAL-BUFFERS-AND-NODES 1) (GEN-BUFFER-AND-NODES 1))
   ((GENERATE-GLOBAL-BUFFERS-AND-NODES 2) (GEN-BUFFER-AND-NODES 2)))
  :Output-Embedding
  (((GENERATE-GLOBAL-BUFFERS-AND-NODES 3) (GEN-BUFFER-AND-NODES 3))
   ((GENERATE-GLOBAL-BUFFERS-AND-NODES 4) (GEN-BUFFER-AND-NODES 4)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("generates address-maps and global message buffers by
repeatedly delivering all messages in the global message ~
buffer ~A and advancing the synchronous nodes in ~A by one ~
step each."
   (INPUT-PORT-NAME>
    (DOC-BP> (GENERATE-GLOBAL-BUFFERS-AND-NODES 2)))
   (INPUT-PORT-NAME>
    (DOC-BP> (GENERATE-GLOBAL-BUFFERS-AND-NODES 1)))))

(Defrule SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER
  "Synchronous Simulation using Global Message Buffer"
  :RHS-Node-Types
  ((INITIAL-INSERT . QUEUE-INSERT)
   (SIMULATION-STEP . GENERATE-GLOBAL-BUFFERS-AND-NODES)
   (SIMULATION-FINISHED? . EARLIEST-SIMULATION-FINISHED))
  :Edge-List
  (((INITIAL-INSERT 3) . (SIMULATION-STEP 2))
   ((SIMULATION-STEP 4) . (SIMULATION-FINISHED? 2))
   ((SIMULATION-STEP 3) . (SIMULATION-FINISHED? 1)))
  :Input-Embedding
  (((SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER 1) (SIMULATION-STEP 1))
   ((SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER 2) (INITIAL-INSERT 1)))
  :Output-Embedding
  (((SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER 3)
    (SIMULATION-FINISHED? 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("Iteratively advances each synchronous node in ~A by handling one ~
message a piece. It uses a global message buffer to ensure that ~
nodes advance in lock-step. The global buffer's initial value is ~
~A. The simulation starts by adding an initial message ~A to ~A. ~
The simulation ends when no node has work to do (i.e., no more ~
messages to handle) and the global message buffer ~A is empty. ~

As messages are handled, new messages are created which are ~
buffered on the global message buffer."
   (INPUT-PORT-NAME>
    (DOC-BP> (SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER 1)))
   (INPUT-PORT-NAME> (DOC-BP> (INITIAL-INSERT 2)))
   (INPUT-PORT-NAME>
    (DOC-BP> (SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER 2)))
   (INPUT-PORT-NAME> (DOC-BP> (INITIAL-INSERT 2)))
   (INPUT-PORT-NAME> (DOC-BP> (INITIAL-INSERT 2)))))
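
The rules from DELIVER-MESSAGES through SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER describe a loop: deliver every message in the global buffer, let each node with work handle one message, and stop once the global buffer and all local buffers are empty. The following Python sketch illustrates that control flow only; it is not the simulator code that the grammar recognizes. The names `Node`, `handler`, and `forward` are hypothetical, messages are assumed to be `(destination, payload)` pairs, and every destination node is assumed to exist already (the grammar's lookup step can also create nodes).

```python
from collections import deque


class Node:
    """Hypothetical synchronous node with a local FIFO buffer of messages."""

    def __init__(self, address, handler):
        self.address = address
        self.local_buffer = deque()
        # handler(node, payload) returns a list of (destination, payload) pairs.
        self.handler = handler


def deliver_messages(global_buffer, address_map):
    """Move each message in the global buffer to its destination's local buffer."""
    while global_buffer:
        destination, payload = global_buffer.popleft()
        address_map[destination].local_buffer.append(payload)


def advance_nodes(address_map, global_buffer):
    """Each node that has work handles one message; new messages go to the global buffer."""
    for node in address_map.values():
        if node.local_buffer:
            payload = node.local_buffer.popleft()
            global_buffer.extend(node.handler(node, payload))


def simulation_finished(address_map, global_buffer):
    """True when the global buffer and every local buffer are empty."""
    return not global_buffer and all(
        not node.local_buffer for node in address_map.values())


def synchronous_simulation(address_map, global_buffer, initial_message):
    """Insert the initial message, then deliver and step until nothing is pending."""
    global_buffer.append(initial_message)
    steps = 0
    while not simulation_finished(address_map, global_buffer):
        deliver_messages(global_buffer, address_map)
        advance_nodes(address_map, global_buffer)
        steps += 1
    return steps


def forward(node, ttl):
    """Example handler: pass a token around a 3-node ring until its count runs out."""
    if ttl <= 0:
        return []
    return [((node.address + 1) % 3, ttl - 1)]


ring = {address: Node(address, forward) for address in range(3)}
steps = synchronous_simulation(ring, deque(), (0, 2))
```

Because delivery and stepping are separate phases, a message generated during one step cannot be handled until the next step, which is the lock-step property the rule's documentation string mentions.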

(Defrule SYNCHRONOUS-SIMULATION
  "Synchronous Simulation using Global Buffer"
  :RHS-Node-Types
  ((SIMULATE-W-BUFFER . SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER))
  :Input-Embedding
  (((SYNCHRONOUS-SIMULATION 1) (SIMULATE-W-BUFFER 1))
   ((SYNCHRONOUS-SIMULATION 2) (SIMULATE-W-BUFFER 2)))
  :Output-Embedding
  (((SYNCHRONOUS-SIMULATION 3) (SIMULATE-W-BUFFER 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("synchronously simulates a collection of processing nodes handling
messages. The synchronous nodes (which represent the processing ~
nodes) are collected in an address-map, called ~a. Each node ~
maintains a local buffer of pending messages to handle."
   (INPUT-PORT-NAME> (DOC-BP> (SYNCHRONOUS-SIMULATION 1)))))

(Defrule ENUMERATE-NODES+COMPUTE-AVERAGE
  "Enumerate Nodes and Compute Average"
  :RHS-Node-Types
  ((ENUM-NODES . SEQUENCE-AND-INDEX-ENUMERATION)
   (COMPUTE-BUFFER-SIZE . SUM)
   (SIZE-OF-SEQUENCE . SEQUENCE-SIZE)
   (COMPUTE-AVG . DIVIDE))
  :Edge-List
  (((ENUM-NODES 2) . (COMPUTE-BUFFER-SIZE 1))
   ((COMPUTE-BUFFER-SIZE 2) . (COMPUTE-AVG 1))
   ((SIZE-OF-SEQUENCE 2) . (COMPUTE-AVG 2)))
  :Input-Embedding
  (((ENUMERATE-NODES+COMPUTE-AVERAGE 1) (SIZE-OF-SEQUENCE 1))
   ((ENUMERATE-NODES+COMPUTE-AVERAGE 1) (ENUM-NODES 1)))
  :Output-Embedding
  (((ENUMERATE-NODES+COMPUTE-AVERAGE 2) (COMPUTE-AVG 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates all nodes in ~A and computes the average of the sizes
of their local buffers."
   (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-NODES+COMPUTE-AVERAGE 1)))))

(Defrule AVERAGE-LOCAL-BUFFER-SIZE
  "Average Local Buffer Size"
  :RHS-Node-Types
  ((AVG-LB-SIZE . ENUMERATE-NODES+COMPUTE-AVERAGE))
  :Input-Embedding
  (((AVERAGE-LOCAL-BUFFER-SIZE 1) (AVG-LB-SIZE 1)))
  :Output-Embedding
  (((AVERAGE-LOCAL-BUFFER-SIZE 2) (AVG-LB-SIZE 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("computes the average of the local buffer sizes of all nodes in ~A."
   (INPUT-PORT-NAME> (DOC-BP> (AVERAGE-LOCAL-BUFFER-SIZE 1)))))

(Defrule DESTRUCTIVE-QUEUE-ENUMERATION
  "Destructive Queue Enumeration"
  :RHS-Node-Types
  ((ENUM-PQ . PQ-ENUMERATION))
  :Input-Embedding
  (((DESTRUCTIVE-QUEUE-ENUMERATION 1) (ENUM-PQ 1)
    PRIORITY-QUEUE>QUEUE))
  :Output-Embedding
  (((DESTRUCTIVE-QUEUE-ENUMERATION 2) (ENUM-PQ 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("destructively enumerates the Queue ~A, which is implemented~%~
as a Priority Queue."
   (INPUT-PORT-NAME> (DOC-BP> (DESTRUCTIVE-QUEUE-ENUMERATION 1)))))
(Defrule DESTRUCTIVE-QUEUE-ENUMERATION
  "Destructive Queue Enumeration"
  :RHS-Node-Types
  ((ENUM-FIFO . FIFO-DESTRUCTIVE-ENUMERATION))
  :Input-Embedding
  (((DESTRUCTIVE-QUEUE-ENUMERATION 1) (ENUM-FIFO 1)
    FIFO>QUEUE))
  :Output-Embedding
  (((DESTRUCTIVE-QUEUE-ENUMERATION 2) (ENUM-FIFO 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("destructively enumerates the Queue ~A, which is ~
implemented as a FIFO."
   (INPUT-PORT-NAME>
    (DOC-BP> (DESTRUCTIVE-QUEUE-ENUMERATION 1)))))

(Defrule DESTRUCTIVE-QUEUE-ENUMERATION
  "Destructive Queue Enumeration"
  :RHS-Node-Types
  ((ENUM-STACK . STACK-ENUMERATION))
  :Input-Embedding
  (((DESTRUCTIVE-QUEUE-ENUMERATION 1) (ENUM-STACK 1)
    STACK>QUEUE))
  :Output-Embedding
  (((DESTRUCTIVE-QUEUE-ENUMERATION 2) (ENUM-STACK 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("destructively enumerates the Queue ~A, which is ~
implemented as a Stack."
   (INPUT-PORT-NAME>
    (DOC-BP> (DESTRUCTIVE-QUEUE-ENUMERATION 1)))))

(Defrule STACK-ENUMERATION
  "Stack Enumeration"
  :RHS-Node-Types
  ((ENUM-LL-DESTRUCTIVELY . LE))
  :Input-Embedding
  (((STACK-ENUMERATION 1) (ENUM-LL-DESTRUCTIVELY 1)
    LINKED-LIST>STACK))
  :Output-Embedding
  (((STACK-ENUMERATION 2) (ENUM-LL-DESTRUCTIVELY 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("destructively enumerates the Stack ~A, which is ~
implemented as a Linked-List."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-ENUMERATION 1)))))

(Defrule STACK-ENUMERATION
  "Stack Enumeration"
  :RHS-Node-Types
  ((ENUM-IS-DESTRUCTIVELY . INDEXED-SEQUENCE-ENUMERATION))
  :Input-Embedding
  (((STACK-ENUMERATION 1) (ENUM-IS-DESTRUCTIVELY 1)
    INDEXED-SEQUENCE>STACK))
  :Output-Embedding
  (((STACK-ENUMERATION 2) (ENUM-IS-DESTRUCTIVELY 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("destructively enumerates the Stack ~A, which is ~
implemented as an Indexed Sequence."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-ENUMERATION 1)))))

(Defrule QUEUE-EXTRACT
  "Queue Extract"
  :RHS-Node-Types
  ((EXTRACT-FROM-PQ . PQ-EXTRACT))
  :Input-Embedding
  (((QUEUE-EXTRACT 1) (EXTRACT-FROM-PQ 1)
    PRIORITY-QUEUE>QUEUE))
  :Output-Embedding
  (((QUEUE-EXTRACT 2) (EXTRACT-FROM-PQ 2))
   ((QUEUE-EXTRACT 3) (EXTRACT-FROM-PQ 3)
    PRIORITY-QUEUE>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("extracts an element from the queue ~A, which is
implemented as a Priority Queue."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-EXTRACT 1)))))

(Defrule QUEUE-EXTRACT
  "Queue Extract"
  :RHS-Node-Types
  ((EXTRACT-FROM-FIFO . FIFO-DEQUEUE))
  :Input-Embedding
  (((QUEUE-EXTRACT 1) (EXTRACT-FROM-FIFO 1)
    FIFO>QUEUE))
  :Output-Embedding
  (((QUEUE-EXTRACT 2) (EXTRACT-FROM-FIFO 2))
   ((QUEUE-EXTRACT 3) (EXTRACT-FROM-FIFO 3)
    FIFO>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("extracts an element from the queue ~A, which is ~
implemented as a FIFO."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-EXTRACT 1)))))

(Defrule QUEUE-EXTRACT
  "Queue Extract"
  :RHS-Node-Types
  ((EXTRACT-FROM-STACK . STACK-POP))
  :Input-Embedding
  (((QUEUE-EXTRACT 1) (EXTRACT-FROM-STACK 1)
    STACK>QUEUE))
  :Output-Embedding
  (((QUEUE-EXTRACT 2) (EXTRACT-FROM-STACK 2))
   ((QUEUE-EXTRACT 3) (EXTRACT-FROM-STACK 3)
    STACK>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("extracts an element from the queue ~A, which is implemented as a ~
Stack."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-EXTRACT 1)))))

(Defrule QUEUE-INSERT
  "Queue Insert"
  :RHS-Node-Types
  ((ADD-TO-Q3 . PQ-INSERT))
  :Input-Embedding
  (((QUEUE-INSERT 1) (ADD-TO-Q3 1))
   ((QUEUE-INSERT 2) (ADD-TO-Q3 2)
    PRIORITY-QUEUE>QUEUE))
  :Output-Embedding
  (((QUEUE-INSERT 3) (ADD-TO-Q3 3)
    PRIORITY-QUEUE>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("enqueues ~A on the Queue ~A, which is implemented as a ~
Priority-Queue."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-INSERT 2)))))

(Defrule QUEUE-INSERT
  "Queue Insert"
  :RHS-Node-Types
  ((ADD-TO-Q2 . FIFO-ENQUEUE))
  :Input-Embedding
  (((QUEUE-INSERT 1) (ADD-TO-Q2 1))
   ((QUEUE-INSERT 2) (ADD-TO-Q2 2)
    FIFO>QUEUE))
  :Output-Embedding
  (((QUEUE-INSERT 3) (ADD-TO-Q2 3)
    FIFO>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("enqueues ~A on the Queue ~A, which is implemented as a FIFO."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-INSERT 2)))))

(Defrule QUEUE-INSERT
  "Queue Insert"
  :RHS-Node-Types
  ((ADD-TO-Q1 . STACK-PUSH))
  :Input-Embedding
  (((QUEUE-INSERT 1) (ADD-TO-Q1 1))
   ((QUEUE-INSERT 2) (ADD-TO-Q1 2)
    STACK>QUEUE))
  :Output-Embedding
  (((QUEUE-INSERT 3) (ADD-TO-Q1 3)
    STACK>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("enqueues ~A on the Queue ~A, which is implemented as a Stack."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-INSERT 2)))))

(Defrule QUEUE-EMPTY?
  "Queue Empty?"
  :RHS-Node-Types
  ((EMPTY3? . PQ-EMPTY))
  :Input-Embedding
  (((QUEUE-EMPTY? 1) (EMPTY3? 1)
    PRIORITY-QUEUE>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the Queue ~A is empty.~%~
The Queue is implemented as a Priority-Queue."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-EMPTY? 1)))))

(Defrule QUEUE-EMPTY?
  "Queue Empty?"
  :RHS-Node-Types
  ((EMPTY2? . FIFO-EMPTY?))
  :Input-Embedding
  (((QUEUE-EMPTY? 1) (EMPTY2? 1)
    FIFO>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the Queue ~A is empty.~%~
The Queue is implemented as a FIFO."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-EMPTY? 1)))))

(Defrule QUEUE-EMPTY?
  "Queue Empty?"
  :RHS-Node-Types
  ((EMPTY1? . STACK-EMPTY?))
  :Input-Embedding
  (((QUEUE-EMPTY? 1) (EMPTY1? 1)
    STACK>QUEUE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the Queue ~A is empty.
The Queue is implemented as a Stack."
   (INPUT-PORT-NAME> (DOC-BP> (QUEUE-EMPTY? 1)))))
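
The QUEUE-INSERT, QUEUE-EXTRACT, QUEUE-EMPTY?, and DESTRUCTIVE-QUEUE-ENUMERATION rules each appear three times, once per implementation (Priority Queue, FIFO, Stack), so that the same abstract Queue cliche is recognized whichever structure the code uses. As an illustration only, the Python sketch below gives three interchangeable classes with one shared interface. The class and method names are hypothetical, and the priority queue is assumed to extract the smallest element first.

```python
import heapq
from collections import deque


class StackQueue:
    """Queue implemented as a stack: extract returns the newest item."""

    def __init__(self):
        self._items = []

    def insert(self, item):
        self._items.append(item)

    def extract(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items


class FifoQueue:
    """Queue implemented as a FIFO: extract returns the oldest item."""

    def __init__(self):
        self._items = deque()

    def insert(self, item):
        self._items.append(item)

    def extract(self):
        return self._items.popleft()

    def is_empty(self):
        return not self._items


class PriorityQueue:
    """Queue implemented as a binary heap: extract returns the smallest item."""

    def __init__(self):
        self._heap = []

    def insert(self, item):
        heapq.heappush(self._heap, item)

    def extract(self):
        return heapq.heappop(self._heap)

    def is_empty(self):
        return not self._heap


def destructive_enumeration(queue):
    """Yield every element by repeatedly extracting until the queue is empty."""
    while not queue.is_empty():
        yield queue.extract()


def drain(queue_class, items):
    """Insert the items into a fresh queue and enumerate it destructively."""
    queue = queue_class()
    for item in items:
        queue.insert(item)
    return list(destructive_enumeration(queue))
```

Only the extraction order differs between the three classes; `destructive_enumeration` is written once against the shared interface, much as the abstract DESTRUCTIVE-QUEUE-ENUMERATION nonterminal sits above its three implementation rules.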

(Defrule STACK-EMPTY?
  "Stack Empty?"
  :RHS-Node-Types
  ((LL-EMPTY? . LIST-EMPTY))
  :Input-Embedding
  (((STACK-EMPTY? 1) (LL-EMPTY? 1)
    LINKED-LIST>STACK))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the Stack ~A is empty.~%~
The Stack is implemented as a Linked List."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-EMPTY? 1)))))

(Defrule STACK-EMPTY?
  "Stack Empty?"
  :RHS-Node-Types
  ((IS-EMPTY? . INDEXED-SEQUENCE-EMPTY))
  :Input-Embedding
  (((STACK-EMPTY? 1) (IS-EMPTY? 1)
    INDEXED-SEQUENCE>STACK))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the Stack ~A is empty.~%~
The Stack is implemented as an Indexed Sequence."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-EMPTY? 1)))))

(Defrule STACK-PUSH
  "Stack Push"
  :RHS-Node-Types
  ((ADD-TO-LL . LIST-PUSH))
  :Input-Embedding
  (((STACK-PUSH 1) (ADD-TO-LL 1))
   ((STACK-PUSH 2) (ADD-TO-LL 2)
    LINKED-LIST>STACK))
  :Output-Embedding
  (((STACK-PUSH 3) (ADD-TO-LL 3)
    LINKED-LIST>STACK))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("pushes ~A onto the stack ~A, which is implemented as a ~
Linked List."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-PUSH 1)))
   (INPUT-PORT-NAME> (DOC-BP> (STACK-PUSH 2)))))

(Defrule STACK-PUSH
  "Stack Push"
  :RHS-Node-Types
  ((ADD-TO-IS . INDEXED-SEQUENCE-INSERT))
  :Input-Embedding
  (((STACK-PUSH 1) (ADD-TO-IS 1))
   ((STACK-PUSH 2) (ADD-TO-IS 2)
    INDEXED-SEQUENCE>STACK))
  :Output-Embedding
  (((STACK-PUSH 3) (ADD-TO-IS 3)
    INDEXED-SEQUENCE>STACK))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("pushes ~A onto the stack ~A, which is implemented as an ~
Indexed Sequence."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-PUSH 1)))
   (INPUT-PORT-NAME> (DOC-BP> (STACK-PUSH 2)))))

(Defrule STACK-POP
  "Stack-Pop"
  :RHS-Node-Types
  ((EXTRACT-FROM-LL . LIST-POP))
  :Input-Embedding
  (((STACK-POP 1) (EXTRACT-FROM-LL 1)
    LINKED-LIST>STACK))
  :Output-Embedding
  (((STACK-POP 2) (EXTRACT-FROM-LL 2))
   ((STACK-POP 3) (EXTRACT-FROM-LL 3)
    LINKED-LIST>STACK))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("pops the stack ~A, which is implemented as a Linked List."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-POP 1)))))

(Defrule STACK-POP
  "Stack-Pop"
  :RHS-Node-Types
  ((EXTRACT-FROM-IS . INDEXED-SEQUENCE-EXTRACT))
  :Input-Embedding
  (((STACK-POP 1) (EXTRACT-FROM-IS 1)
    INDEXED-SEQUENCE>STACK))
  :Output-Embedding
  (((STACK-POP 2) (EXTRACT-FROM-IS 2))
   ((STACK-POP 3) (EXTRACT-FROM-IS 3)
    INDEXED-SEQUENCE>STACK))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("pops the stack ~A, which is implemented as an indexed-sequence."
   (INPUT-PORT-NAME> (DOC-BP> (STACK-POP 1)))))

(Defrule CIS-DESTRUCTIVE-ENUMERATION
  "Circular-Indexed-Sequence Destructive Enumeration"
  :RHS-Node-Types
  ((ENUM-FINISHED? . CIS-EMPTY)
   (EXTRACT-NEXT . CIS-EXTRACT))
  :Input-Embedding
  (((CIS-DESTRUCTIVE-ENUMERATION 1) (EXTRACT-NEXT 1))
   ((CIS-DESTRUCTIVE-ENUMERATION 1) (ENUM-FINISHED? 1)))
  :Output-Embedding
  (((CIS-DESTRUCTIVE-ENUMERATION 2) (EXTRACT-NEXT 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates all of the elements in the Circular-Indexed-Sequence ~A
by destructively extracting them from the sequence. The sequence
is filled in ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CIS-DESTRUCTIVE-ENUMERATION 1)))
   (GROWTH-DIRECTION (N> CIS-DESTRUCTIVE-ENUMERATION))))

(Defrule FIFO-DESTRUCTIVE-ENUMERATION
  "FIFO Destructive Enumeration"
  :RHS-Node-Types
  ((ENUM-CIS-DESTRUCTIVELY . CIS-DESTRUCTIVE-ENUMERATION))
  :Input-Embedding
  (((FIFO-DESTRUCTIVE-ENUMERATION 1) (ENUM-CIS-DESTRUCTIVELY 1)
    CIRCULAR-INDEXED-SEQUENCE>FIFO))
  :Output-Embedding
  (((FIFO-DESTRUCTIVE-ENUMERATION 2) (ENUM-CIS-DESTRUCTIVELY 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("destructively enumerates the FIFO queue ~A, which is implemented ~
as a Circular Indexed Sequence."
   (INPUT-PORT-NAME> (DOC-BP> (FIFO-DESTRUCTIVE-ENUMERATION 1)))))

(Defrule CIS-EMPTY
  "CIS Empty"
  :RHS-Node-Types
  ((ZERO-FILL-COUNT? . COMMUTATIVE-BINARY-FUNCTION)
   (TEST-EQUALITY . NULL-TEST))
  :Edge-List
  (((ZERO-FILL-COUNT? 3) . (TEST-EQUALITY 1)))
  :Input-Embedding
  (((CIS-EMPTY 1) (ZERO-FILL-COUNT? 1)
    FILL-COUNT))
  :L-R-Link COMPOSITION
  :Doc
  ("tests whether the Circular-Indexed-Sequence ~A is empty."
   (INPUT-PORT-NAME> (DOC-BP> (CIS-EMPTY 1)))))

(Defrule FIFO-EMPTY?
  "FIFO Empty"
  :RHS-Node-Types
  ((CIS-EMPTY? . CIS-EMPTY))
  :Input-Embedding
  (((FIFO-EMPTY? 1) (CIS-EMPTY? 1)
    CIRCULAR-INDEXED-SEQUENCE>FIFO))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the FIFO queue ~A is empty. The FIFO is implemented
as a Circular Indexed Sequence."
   (INPUT-PORT-NAME> (DOC-BP> (FIFO-EMPTY? 1)))))

(Defrule CIS-FULL
  "CIS Full"
  :RHS-Node-Types
  ((ONE-LESS . DECREMENT)
   (MAX-FILL-COUNT? . LT)
   (TEST-COMPARISON . NULL-TEST))
  :Edge-List
  (((ONE-LESS 2) . (MAX-FILL-COUNT? 2))
   ((MAX-FILL-COUNT? 3) . (TEST-COMPARISON 1)))
  :Input-Embedding
  (((CIS-FULL 1) (ONE-LESS 1)
    SIZE)
   ((CIS-FULL 1) (MAX-FILL-COUNT? 1) FILL-COUNT))
  :L-R-Link COMPOSITION
  :Doc
  ("tests whether the Circular-Indexed-Sequence ~A is full."
   (INPUT-PORT-NAME> (DOC-BP> (CIS-FULL 1)))))


(Defrule GROW-CIS
  "Grow Circular-Indexed-Sequence"
  :RHS-Node-Types
  ((THE-GROWER . INTERMEDIATE-GROW-CIS))
  :Input-Embedding
  (((GROW-CIS 1) (THE-GROWER 1)))
  :Output-Embedding
  (((GROW-CIS 2) (THE-GROWER 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("makes a new Circular Indexed Sequence that is double the ~
size of the Circular Indexed Sequence ~A and then ~
transfers all of the elements of ~A to the new CIS. The ~
new CIS's First is at index 0 and its Last is at index = ~
the number of elements in the sequence.~%~
The new sequence grows ~A."
   (INPUT-PORT-NAME> (DOC-BP> (THE-GROWER 1)))
   (INPUT-PORT-NAME> (DOC-BP> (THE-GROWER 1)))
   (GROWTH-DIRECTION (N> THE-GROWER))))


(Defrule INTERMEDIATE-GROW-CIS
  "Grow Circular-Indexed-Sequence (Intermediate)"
  :RHS-Node-Types
  ((ENUMERATE-WHOLE-CIS . BOUNDED-CIS-ENUMERATION)
   (DOUBLE-SIZE . DOUBLE)
   (MAKE-NEW-BASE . NEW-SEQUENCE)
   (SUCCESSIVE-INDICES . COUNT)
   (ACCUMULATE-NEW-BASE . SEQUENCE-ACCUMULATE))
  :Edge-List
  (((ENUMERATE-WHOLE-CIS 5) . (ACCUMULATE-NEW-BASE 1))
   ((DOUBLE-SIZE 2) . (MAKE-NEW-BASE 1))
   ((MAKE-NEW-BASE 2) . (ACCUMULATE-NEW-BASE 3))
   ((SUCCESSIVE-INDICES 2) . (ACCUMULATE-NEW-BASE 2)))
  :Input-Embedding
  (((INTERMEDIATE-GROW-CIS 1) (ENUMERATE-WHOLE-CIS 1)
    BASE)
   ((INTERMEDIATE-GROW-CIS 1) (ENUMERATE-WHOLE-CIS 2)
    FIRST)
   ((INTERMEDIATE-GROW-CIS 1) (ENUMERATE-WHOLE-CIS 3)
    FILL-COUNT)
   ((INTERMEDIATE-GROW-CIS 1) (DOUBLE-SIZE 1) SIZE)
   ((INTERMEDIATE-GROW-CIS 1) (ENUMERATE-WHOLE-CIS 4)
    SIZE)
   ((INTERMEDIATE-GROW-CIS 2) (SUCCESSIVE-INDICES 1)))
  :Output-Embedding
  (((INTERMEDIATE-GROW-CIS 3) (ACCUMULATE-NEW-BASE 4)
    BASE)
   ((INTERMEDIATE-GROW-CIS 3) (DOUBLE-SIZE 2)
    SIZE))
  :St-Thrus
  (((INTERMEDIATE-GROW-CIS 2) (INTERMEDIATE-GROW-CIS 3))
   ((INTERMEDIATE-GROW-CIS 1) (INTERMEDIATE-GROW-CIS 3)
    FILL-COUNT)
   ((INTERMEDIATE-GROW-CIS 1) (INTERMEDIATE-GROW-CIS 3)
    FILL-COUNT))
  :L-R-Link COMPOSITION
  :Doc
  ("intermediate non-terminal in Grow-cis."))
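
The GROW-CIS and INTERMEDIATE-GROW-CIS rules describe doubling a circular indexed sequence: double the size, allocate a new base, enumerate the old elements in order, and accumulate them at successive indices of the new base starting from 0. The following Python sketch shows that operation under stated assumptions. The `CIS` dataclass and `grow_cis` are hypothetical names, the four fields mirror the BASE, FIRST, FILL-COUNT, and SIZE attributes used in the rules, and the sequence is assumed to fill toward higher indices.

```python
from dataclasses import dataclass


@dataclass
class CIS:
    """Hypothetical circular indexed sequence with the grammar's four attributes."""
    base: list
    first: int
    fill_count: int
    size: int


def grow_cis(cis):
    """Return a new CIS of double the size holding the same elements from index 0."""
    new_size = 2 * cis.size                               # DOUBLE-SIZE
    new_base = [None] * new_size                          # MAKE-NEW-BASE
    for new_index in range(cis.fill_count):               # SUCCESSIVE-INDICES
        old_index = (cis.first + new_index) % cis.size    # ENUMERATE-WHOLE-CIS
        new_base[new_index] = cis.base[old_index]         # ACCUMULATE-NEW-BASE
    # First moves to index 0; the fill count is carried over unchanged.
    return CIS(new_base, first=0, fill_count=cis.fill_count, size=new_size)


# A full 4-slot sequence whose logical order 10, 20, 30, 40 wraps around the end.
full = CIS(base=[30, 40, 10, 20], first=2, fill_count=4, size=4)
grown = grow_cis(full)
```

After growing, the elements occupy indices 0 through `fill_count - 1`, so the next free slot is at index `fill_count`, which agrees with the documentation string's statement that Last equals the number of elements.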


(Defrule COMBINATION-FUNCTION
  "Combination Function"
  :RHS-Node-Types
  ((SUBTRACT-THEM . MINUS))
  :Input-Embedding
  (((COMBINATION-FUNCTION 1) (SUBTRACT-THEM 1))
   ((COMBINATION-FUNCTION 2) (SUBTRACT-THEM 2)))
  :Output-Embedding
  (((COMBINATION-FUNCTION 3) (SUBTRACT-THEM 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("subtracts ~A from ~A."
   (INPUT-PORT-NAME> (DOC-BP> (COMBINATION-FUNCTION 2)))
   (INPUT-PORT-NAME> (DOC-BP> (COMBINATION-FUNCTION 1)))))


(Defrule COMBINATION-FUNCTION
  "Combination Function"
  :RHS-Node-Types
  ((SUM-THEM . COMMUTATIVE-BINARY-FUNCTION))
  :Input-Embedding
  (((COMBINATION-FUNCTION 1) (SUM-THEM 1))
   ((COMBINATION-FUNCTION 2) (SUM-THEM 2)))
  :Output-Embedding
  (((COMBINATION-FUNCTION 3) (SUM-THEM 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("combines ~A and ~A by adding them to each other."
   (INPUT-PORT-NAME> (DOC-BP> (COMBINATION-FUNCTION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (COMBINATION-FUNCTION 2)))))


(Defrule BOUNDED-CIS-ENUMERATION
  "Bounded Circular-Indexed-Sequence Enumeration"
  :RHS-Node-Types
  ((COUNT-N-TIMES . BOUNDED-COUNT)
   (COMBINE-COUNT-FIRST . COMBINATION-FUNCTION)
   (WRAP-INDEX . MOD)
   (MAP-ACCESS-CIS . SELECT-TERM))
  :Edge-List
  (((COUNT-N-TIMES 3) . (COMBINE-COUNT-FIRST 2))
   ((COMBINE-COUNT-FIRST 3) . (WRAP-INDEX 1))
   ((WRAP-INDEX 3) . (MAP-ACCESS-CIS 2)))
  :Input-Embedding
  (((BOUNDED-CIS-ENUMERATION 1) (MAP-ACCESS-CIS 1))
   ((BOUNDED-CIS-ENUMERATION 2) (COMBINE-COUNT-FIRST 1))
   ((BOUNDED-CIS-ENUMERATION 3) (COUNT-N-TIMES 2))
   ((BOUNDED-CIS-ENUMERATION 4) (WRAP-INDEX 2)))
  :Output-Embedding
  (((BOUNDED-CIS-ENUMERATION 5) (MAP-ACCESS-CIS 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates N elements of the Circular-Indexed-Sequence ~A starting ~
from ~A, where N = ~A. The sequence is filled in ~A."
   (INPUT-PORT-NAME> (DOC-BP> (BOUNDED-CIS-ENUMERATION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (BOUNDED-CIS-ENUMERATION 2)))
   (INPUT-PORT-NAME> (DOC-BP> (BOUNDED-CIS-ENUMERATION 3)))
   (GROWTH-DIRECTION (N> BOUNDED-CIS-ENUMERATION))))
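
;;; A minimal sketch, in plain Common Lisp, of the computation that the
;;; edge list of BOUNDED-CIS-ENUMERATION describes: a count is combined
;;; with FIRST, wrapped modulo SIZE, and used to select a term from BASE.
;;; The function name, the argument order, the zero-based count, and the
;;; STEP parameter (standing in for the growth direction) are assumptions
;;; made for illustration; they are not part of the grammar.

```lisp
;; Hypothetical helper, named only for this example.
(defun enumerate-cis (base first n size &optional (step 1))
  "Collect N elements of the vector BASE starting at index FIRST,
wrapping indices modulo SIZE. STEP is +1 or -1 (assumed growth direction)."
  (loop for count from 0 below n
        ;; COMBINE-COUNT-FIRST, then WRAP-INDEX, then MAP-ACCESS-CIS.
        for index = (mod (+ first (* step count)) size)
        collect (aref base index)))
```

;;; For example, (enumerate-cis #(a b c d) 2 3 4) visits indices 2, 3, 0.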


(Defrule CIRCULAR-INDEXED-SEQUENCE-ENUMERATION
  "Circular-Indexed-Sequence Enumeration"
  :RHS-Node-Types
  ((ENUMERATE-ENTIRE-CIS . BOUNDED-CIS-ENUMERATION))
  :Input-Embedding
  (((CIRCULAR-INDEXED-SEQUENCE-ENUMERATION 1) (ENUMERATE-ENTIRE-CIS 1)
    BASE)
   ((CIRCULAR-INDEXED-SEQUENCE-ENUMERATION 1) (ENUMERATE-ENTIRE-CIS 2)
    FIRST)
   ((CIRCULAR-INDEXED-SEQUENCE-ENUMERATION 1) (ENUMERATE-ENTIRE-CIS 3)
    FILL-COUNT)
   ((CIRCULAR-INDEXED-SEQUENCE-ENUMERATION 1) (ENUMERATE-ENTIRE-CIS 4)
    SIZE))
  :Output-Embedding
  (((CIRCULAR-INDEXED-SEQUENCE-ENUMERATION 2) (ENUMERATE-ENTIRE-CIS 5)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("enumerates all of the elements in the Circular-Indexed-Sequence ~A. ~
The sequence is filled in ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CIRCULAR-INDEXED-SEQUENCE-ENUMERATION 1)))
   (GROWTH-DIRECTION (N> CIRCULAR-INDEXED-SEQUENCE-ENUMERATION))))


(Defrule FIFO-ENUMERATION
  "FIFO Enumeration"
  :RHS-Node-Types
  ((ENUMERATE-CIS . CIRCULAR-INDEXED-SEQUENCE-ENUMERATION))
  :Input-Embedding
  (((FIFO-ENUMERATION 1) (ENUMERATE-CIS 1)
    CIRCULAR-INDEXED-SEQUENCE>FIFO))
  :Output-Embedding
  (((FIFO-ENUMERATION 2) (ENUMERATE-CIS 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("enumerates the FIFO queue ~A, which is implemented as a Circular ~
Indexed Sequence. The queue is not changed. The queue grows ~A."
   (INPUT-PORT-NAME> (DOC-BP> (FIFO-ENUMERATION 1)))
   (GROWTH-DIRECTION (N> FIFO-ENUMERATION))))


(Defrule CIS-ADD
  "Circular-Indexed-Sequence Add"
  :RHS-Node-Types
  ((FULL? . CIS-FULL)
   (ROOMY-ADD . ROOMY-CIS-ADD)
   (MAKE-ROOM . GROW-CIS))
  :Edge-List
  (((MAKE-ROOM 2) . (ROOMY-ADD 2)))
  :Input-Embedding
  (((CIS-ADD 1) (ROOMY-ADD 1))
   ((CIS-ADD 2) (MAKE-ROOM 1))
   ((CIS-ADD 2) (ROOMY-ADD 2))
   ((CIS-ADD 2) (FULL? 1)))
  :Output-Embedding
  (((CIS-ADD 3) (ROOMY-ADD 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("adds the element ~A to the Circular-Indexed-Sequence ~A,~%~
making room for it if the Circular-Indexed-Sequence is full.~%~
The sequence is filled in ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CIS-ADD 1)))
   (INPUT-PORT-NAME> (DOC-BP> (CIS-ADD 2)))
   (GROWTH-DIRECTION (N> CIS-ADD))))

(Defrule ROOMY-CIS-ADD
  "Roomy Circular-Indexed-Sequence Add"
  :RHS-Node-Types
  ((ADD-TO-DATA . NEW-TERM)
   (BUMP-LAST . INCREMENT-OR-DECREMENT)
   (WRAP-INDEX-AROUND . MOD)
   (INCREMENT-FILL-COUNT . INCREMENT))
  :Edge-List
  (((BUMP-LAST 2) . (WRAP-INDEX-AROUND 1)))
  :Input-Embedding
  (((ROOMY-CIS-ADD 1) (ADD-TO-DATA 1))
   ((ROOMY-CIS-ADD 2) (ADD-TO-DATA 3)
    BASE)
   ((ROOMY-CIS-ADD 2) (WRAP-INDEX-AROUND 2)
    SIZE)
   ((ROOMY-CIS-ADD 2) (INCREMENT-FILL-COUNT 1)
    FILL-COUNT)
   ((ROOMY-CIS-ADD 2) (BUMP-LAST 1)
    LAST)
   ((ROOMY-CIS-ADD 2) (ADD-TO-DATA 2)
    LAST))
  :Output-Embedding
  (((ROOMY-CIS-ADD 3) (WRAP-INDEX-AROUND 3)
    LAST)
   ((ROOMY-CIS-ADD 3) (INCREMENT-FILL-COUNT 2)
    FILL-COUNT)
   ((ROOMY-CIS-ADD 3) (ADD-TO-DATA 4)
    BASE))
  :St-Thrus
  (((ROOMY-CIS-ADD 2) (ROOMY-CIS-ADD 3)
    SIZE)
   ((ROOMY-CIS-ADD 2) (ROOMY-CIS-ADD 3)
    FIRST))
  :L-R-Link COMPOSITION
  :Doc
  ("adds the element ~A to the Circular-Indexed-Sequence ~A, ~
(which has room for it)
The sequence is filled in ~A."
   (INPUT-PORT-NAME> (DOC-BP> (ROOMY-CIS-ADD 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ROOMY-CIS-ADD 2)))
   (GROWTH-DIRECTION (N> ROOMY-CIS-ADD))))

(Defrule FIFO-ENQUEUE
  "FIFO Enqueue"
  :RHS-Node-Types
  ((ADD-TO-CIS-LAST . CIS-ADD))
  :Input-Embedding
  (((FIFO-ENQUEUE 1) (ADD-TO-CIS-LAST 1))
   ((FIFO-ENQUEUE 2) (ADD-TO-CIS-LAST 2)
    CIRCULAR-INDEXED-SEQUENCE>FIFO))
  :Output-Embedding
  (((FIFO-ENQUEUE 3) (ADD-TO-CIS-LAST 3)
    CIRCULAR-INDEXED-SEQUENCE>FIFO))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("enqueues ~A on the FIFO queue ~A, which is implemented as
a Circular Indexed Sequence.~%~
The queue grows ~A."
   (INPUT-PORT-NAME> (DOC-BP> (FIFO-ENQUEUE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (FIFO-ENQUEUE 2)))
   (GROWTH-DIRECTION (N> FIFO-ENQUEUE))))

;;;  Figures  3-24/  4-11. 

(Defrule CIS-EXTRACT
  "Circular-Indexed-Sequence Extract"
  :RHS-Node-Types
  ((ACCESS-BASE . SELECT-TERM)
   (BUMP-FIRST . INCREMENT-OR-DECREMENT)
   (WRAP-AROUND-INDEX . MOD)
   (DECREMENT-FILL-COUNT . DECREMENT))
  :Edge-List
  (((BUMP-FIRST 2) . (WRAP-AROUND-INDEX 1)))
  :Input-Embedding
  (((CIS-EXTRACT 1) (BUMP-FIRST 1)
    FIRST)
   ((CIS-EXTRACT 1) (ACCESS-BASE 2)
    FIRST)
   ((CIS-EXTRACT 1) (ACCESS-BASE 1)
    BASE)
   ((CIS-EXTRACT 1) (WRAP-AROUND-INDEX 2)
    SIZE)
   ((CIS-EXTRACT 1) (DECREMENT-FILL-COUNT 1)
    FILL-COUNT))
  :Output-Embedding
  (((CIS-EXTRACT 2) (ACCESS-BASE 3))
   ((CIS-EXTRACT 3) (WRAP-AROUND-INDEX 3)
    FIRST)
   ((CIS-EXTRACT 3) (DECREMENT-FILL-COUNT 2)
    FILL-COUNT))
  :St-Thrus
  (((CIS-EXTRACT 1) (CIS-EXTRACT 3)
    LAST)
   ((CIS-EXTRACT 1) (CIS-EXTRACT 3)
    SIZE)
   ((CIS-EXTRACT 1) (CIS-EXTRACT 3)
    BASE))
  :L-R-Link COMPOSITION
  :Doc
  ("extracts the First element from the Circular Indexed-Sequence ~A.
The sequence is filled in ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CIS-EXTRACT 1)))
   (GROWTH-DIRECTION (N> CIS-EXTRACT))))

;;; Figure 4-12.

(Defrule FIFO-DEQUEUE
  "FIFO Dequeue"
  :RHS-Node-Types
  ((EXTRACT-CIS-FIRST . CIS-EXTRACT))
  :Input-Embedding
  (((FIFO-DEQUEUE 1) (EXTRACT-CIS-FIRST 1)
    CIRCULAR-INDEXED-SEQUENCE>FIFO))
  :Output-Embedding
  (((FIFO-DEQUEUE 2) (EXTRACT-CIS-FIRST 2))
   ((FIFO-DEQUEUE 3) (EXTRACT-CIS-FIRST 3)
    CIRCULAR-INDEXED-SEQUENCE>FIFO))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("dequeues the FIFO queue ~A, which is implemented as a Circular ~
Indexed-Sequence.
The queue grows ~A."
   (INPUT-PORT-NAME> (DOC-BP> (FIFO-DEQUEUE 1)))
   (GROWTH-DIRECTION (N> FIFO-DEQUEUE))))
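
;;; The FIFO rules above implement a queue as a Circular-Indexed-Sequence
;;; with BASE, FIRST, LAST, FILL-COUNT, and SIZE parts. The following is a
;;; rough Common Lisp sketch of that data structure, not code taken from
;;; the analyzed programs. The struct, its default capacity of 4, the
;;; function names, and the choice of an upward-growing sequence are all
;;; assumptions; the GROW-CIS step of CIS-ADD is left out, so the enqueue
;;; corresponds to the "roomy" case only.

```lisp
;; Hypothetical structure mirroring the parts named in the embeddings.
(defstruct cis
  (base (make-array 4))   ; storage vector
  (first 0)               ; index of the oldest element
  (last 0)                ; index at which the next element is stored
  (fill-count 0)          ; number of stored elements
  (size 4))               ; capacity, assumed equal to (length base)

(defun fifo-enqueue (element queue)
  "Roomy add: QUEUE must not be full."
  (assert (< (cis-fill-count queue) (cis-size queue)))
  ;; ADD-TO-DATA: store at the current LAST index.
  (setf (aref (cis-base queue) (cis-last queue)) element)
  ;; BUMP-LAST followed by WRAP-INDEX-AROUND.
  (setf (cis-last queue) (mod (1+ (cis-last queue)) (cis-size queue)))
  ;; INCREMENT-FILL-COUNT.
  (incf (cis-fill-count queue))
  queue)

(defun fifo-dequeue (queue)
  "Extract and return the first element; QUEUE must not be empty."
  (assert (plusp (cis-fill-count queue)))
  ;; ACCESS-BASE: read at the current FIRST index.
  (let ((element (aref (cis-base queue) (cis-first queue))))
    ;; BUMP-FIRST followed by WRAP-AROUND-INDEX.
    (setf (cis-first queue) (mod (1+ (cis-first queue)) (cis-size queue)))
    ;; DECREMENT-FILL-COUNT.
    (decf (cis-fill-count queue))
    element))
```

;;; Elements come out in the order they went in, including after LAST has
;;; wrapped past the end of BASE.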

(Defrule EVALUATE-ARGUMENTS
  "Evaluate-Arguments"
  :RHS-Node-Types
  ((EVAL-EXPS . ENUM-EVAL-COLLECT))
  :Input-Embedding
  (((EVALUATE-ARGUMENTS 1) (EVAL-EXPS 1))
   ((EVALUATE-ARGUMENTS 2) (EVAL-EXPS 2))
   ((EVALUATE-ARGUMENTS 3) (EVAL-EXPS 3))
   ((EVALUATE-ARGUMENTS 4) (EVAL-EXPS 4)))
  :Output-Embedding
  (((EVALUATE-ARGUMENTS 5) (EVAL-EXPS 5))
   ((EVALUATE-ARGUMENTS 6) (EVAL-EXPS 6))
   ((EVALUATE-ARGUMENTS 7) (EVAL-EXPS 7))
   ((EVALUATE-ARGUMENTS 8) (EVAL-EXPS 8)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("evaluates the arguments ~A."
   (INPUT-PORT-NAME> (DOC-BP> (EVAL-EXPS 1)))))

(Defrule ENUM-EVAL-COLLECT
  "Enumerate, Evaluate, and Collect"
  :RHS-Node-Types
  ((ENUMERATE-ARGS . LE)
   (EVALUATE-THEM . EVALUATE-MAP)
   (COLLECT-RESULTS . CONS-ACCUMULATE-UP))
  :Edge-List
  (((ENUMERATE-ARGS 2) . (EVALUATE-MAP 1)))
  :Input-Embedding
  (((ENUM-EVAL-COLLECT 1) (ENUMERATE-ARGS 1))
   ((ENUM-EVAL-COLLECT 2) (EVALUATE-MAP 2))
   ((ENUM-EVAL-COLLECT 3) (EVALUATE-MAP 3))
   ((ENUM-EVAL-COLLECT 4) (EVALUATE-MAP 4)))
  :Output-Embedding
  (((ENUM-EVAL-COLLECT 5) (COLLECT-RESULTS 2))
   ((ENUM-EVAL-COLLECT 6) (EVALUATE-MAP 6))
   ((ENUM-EVAL-COLLECT 7) (EVALUATE-MAP 7))
   ((ENUM-EVAL-COLLECT 8) (EVALUATE-MAP 8)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates the arguments ~A, evaluates each one, and collects~%~
the evaluated arguments in a list, which it returns."
   (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-ARGS 1)))))

(Defrule EVALUATE-MAP
  "Evaluate Map"
  :RHS-Node-Types
  ((ITER-EVAL . ITERATIVE-EVALUATION))
  :Input-Embedding
  (((EVALUATE-MAP 1) (ITER-EVAL 1))
   ((EVALUATE-MAP 2) (ITER-EVAL 2))
   ((EVALUATE-MAP 3) (ITER-EVAL 3))
   ((EVALUATE-MAP 4) (ITER-EVAL 4)))
  :Output-Embedding
  (((EVALUATE-MAP 5) (ITER-EVAL 5))
   ((EVALUATE-MAP 6) (ITER-EVAL 6))
   ((EVALUATE-MAP 7) (ITER-EVAL 7))
   ((EVALUATE-MAP 8) (ITER-EVAL 8)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("applies the function EVALUATE to each expression in the input ~
series of expressions."))

(Defrule ITERATIVE-EVALUATION
  "Iterative Evaluation"
  :RHS-Node-Types
  ((MAP-EVAL . EVALUATE))
  :Input-Embedding
  (((ITERATIVE-EVALUATION 1) (MAP-EVAL 1))
   ((ITERATIVE-EVALUATION 2) (MAP-EVAL 2))
   ((ITERATIVE-EVALUATION 3) (MAP-EVAL 3))
   ((ITERATIVE-EVALUATION 4) (MAP-EVAL 4)))
  :Output-Embedding
  (((ITERATIVE-EVALUATION 5) (MAP-EVAL 5)))
  :St-Thrus
  (((ITERATIVE-EVALUATION 4) (ITERATIVE-EVALUATION 8))
   ((ITERATIVE-EVALUATION 3) (ITERATIVE-EVALUATION 7))
   ((ITERATIVE-EVALUATION 2) (ITERATIVE-EVALUATION 6)))
  :L-R-Link COMPOSITION
  :Doc
  ("iteratively applies the function Evaluate."))


(Defrule RUNNING-STATUS?
  "Execution Still Running Predicate"
  :RHS-Node-Types
  ((STATUS-RUNNING? . RUNNING-TEST))
  :Input-Embedding
  (((RUNNING-STATUS? 1) (STATUS-RUNNING? 1)
    STATUS))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("checks whether the execution context ~A is still running ~
by looking at its STATUS part."
   (INPUT-PORT-NAME> (DOC-BP> (STATUS-RUNNING? 1)))))


(Defrule RUNNING-TEST
  "Running Test"
  :RHS-Node-Types
  ((RUNNING? . COMMUTATIVE-BINARY-FUNCTION)
   (RUN-SPLIT . NULL-TEST))
  :Edge-List
  (((RUNNING? 3) . (RUN-SPLIT 1)))
  :Input-Embedding
  (((RUNNING-TEST 1) (RUNNING? 1)))
  :L-R-Link COMPOSITION
  :Doc
  ("checks whether ~A ~A ~A."
   (INPUT-PORT-NAME> (DOC-BP> (RUNNING? 1)))
   (FUNCTION-TYPE (FUNCTION-INFO (N> RUNNING?)))
   (SOURCE-TYPE (DOC-BP> (RUNNING? 2)))))


(Defrule HANDLE-MESSAGE
  "Handle Message"
  :RHS-Node-Types
  ((PROCESS . LOOKUP-AND-EXECUTE-HANDLER))
  :Input-Embedding
  (((HANDLE-MESSAGE 1) (PROCESS 1))
   ((HANDLE-MESSAGE 2) (PROCESS 2))
   ((HANDLE-MESSAGE 3) (PROCESS 3)))
  :Output-Embedding
  (((HANDLE-MESSAGE 4) (PROCESS 6))
   ((HANDLE-MESSAGE 5) (PROCESS 7)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("handles the Message ~A by looking up its handler code and
executing it."
   (INPUT-PORT-NAME> (DOC-BP> (HANDLE-MESSAGE 1)))))


(Defrule LOOKUP-HANDLER-FOR-MESSAGE
  "Lookup Message Handler"
  :RHS-Node-Types
  ((LOOKUP-HANDLER-OF-TYPE . LOOKUP-HANDLER))
  :Input-Embedding
  (((LOOKUP-HANDLER-FOR-MESSAGE 1) (LOOKUP-HANDLER-OF-TYPE 1)
    TYPE))
  :Output-Embedding
  (((LOOKUP-HANDLER-FOR-MESSAGE 2) (LOOKUP-HANDLER-OF-TYPE 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the handler for message ~A's type ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-HANDLER-FOR-MESSAGE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-HANDLER-FOR-MESSAGE 1)
                              TYPE))))


(Defrule LOOKUP-HANDLER
  "Lookup Handler"
  :RHS-Node-Types
  ((ASSOCIATE-HANDLER-NAME . ASSOCIATIVE-SET-LOOKUP))
  :Input-Embedding
  (((LOOKUP-HANDLER 1) (ASSOCIATE-HANDLER-NAME 1)))
  :Output-Embedding
  (((LOOKUP-HANDLER 2) (ASSOCIATE-HANDLER-NAME 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the handler named ~A.~%~
The global associative set of operators is ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-HANDLER 1)))
   (SOURCE-TYPE (P> (ASSOCIATE-HANDLER-NAME 2)))))


(Defrule LOOKUP-HANDLER
  "Lookup Handler"
  :RHS-Node-Types
  ((LOOKUP-HANDLER-PROPERTY . PROPERTY-LIST-LOOKUP))
  :Input-Embedding
  (((LOOKUP-HANDLER 1) (LOOKUP-HANDLER-PROPERTY 1)))
  :Output-Embedding
  (((LOOKUP-HANDLER 2) (LOOKUP-HANDLER-PROPERTY 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the handler named ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-HANDLER 1)))))

(Defrule FETCH-OP
  "Fetch Operator"
  :RHS-Node-Types
  ((LOOKUP-OP . ASSOCIATIVE-SET-LOOKUP))
  :Input-Embedding
  (((FETCH-OP 1) (LOOKUP-OP 1)))
  :Output-Embedding
  (((FETCH-OP 2) (LOOKUP-OP 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the operator named ~A.~%~
The global associative set of operators is ~A."
   (INPUT-PORT-NAME> (DOC-BP> (FETCH-OP 1)))
   (SOURCE-TYPE (P> (LOOKUP-OP 2)))))


(Defrule FETCH-OP
  "Fetch Operator"
  :RHS-Node-Types
  ((THE-PLIST-LOOKUP . PROPERTY-LIST-LOOKUP))
  :Input-Embedding
  (((FETCH-OP 1) (THE-PLIST-LOOKUP 1)))
  :Output-Embedding
  (((FETCH-OP 2) (THE-PLIST-LOOKUP 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the operator named ~A."
   (INPUT-PORT-NAME> (DOC-BP> (FETCH-OP 1)))))


(Defrule FETCH-AND-APPLY-OPERATOR
  "Fetch and Apply Operator"
  :RHS-Node-Types
  ((GET-OPERATOR . FETCH-OP)
   (APPLY-OPERATOR . APPLY))
  :Edge-List
  (((GET-OPERATOR 2) . (APPLY-OPERATOR 1)))
  :Input-Embedding
  (((FETCH-AND-APPLY-OPERATOR 1) (GET-OPERATOR 1))
   ((FETCH-AND-APPLY-OPERATOR 2) (APPLY-OPERATOR 2))
   ((FETCH-AND-APPLY-OPERATOR 3) (APPLY-OPERATOR 3))
   ((FETCH-AND-APPLY-OPERATOR 4) (APPLY-OPERATOR 4))
   ((FETCH-AND-APPLY-OPERATOR 5) (APPLY-OPERATOR 5)))
  :Output-Embedding
  (((FETCH-AND-APPLY-OPERATOR 6) (APPLY-OPERATOR 6))
   ((FETCH-AND-APPLY-OPERATOR 7) (APPLY-OPERATOR 7))
   ((FETCH-AND-APPLY-OPERATOR 8) (APPLY-OPERATOR 8))
   ((FETCH-AND-APPLY-OPERATOR 9) (APPLY-OPERATOR 9)))
  :L-R-Link COMPOSITION
  :Doc
  ("fetches the operator associated w/ ~A and applies it to the~%~
evaluated arguments ~A."
   (INPUT-PORT-NAME> (DOC-BP> (FETCH-AND-APPLY-OPERATOR 1)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH-AND-APPLY-OPERATOR 2)))))


(Defrule EVALUATE-AND-APPLY
  "Evaluate Arguments and Apply Operator"
  :RHS-Node-Types
  ((EVAL-ARGS . EVALUATE-ARGUMENTS)
   (APPLY-OP . FETCH-AND-APPLY-OPERATOR))
  :Edge-List
  (((EVAL-ARGS 8) . (APPLY-OP 5))
   ((EVAL-ARGS 7) . (APPLY-OP 4))
   ((EVAL-ARGS 6) . (APPLY-OP 3))
   ((EVAL-ARGS 5) . (APPLY-OP 2)))
  :Input-Embedding
  (((EVALUATE-AND-APPLY 1) (APPLY-OP 1))
   ((EVALUATE-AND-APPLY 2) (EVAL-ARGS 1))
   ((EVALUATE-AND-APPLY 3) (EVAL-ARGS 2))
   ((EVALUATE-AND-APPLY 4) (EVAL-ARGS 3))
   ((EVALUATE-AND-APPLY 5) (EVAL-ARGS 4)))
  :Output-Embedding
  (((EVALUATE-AND-APPLY 6) (APPLY-OP 6))
   ((EVALUATE-AND-APPLY 7) (APPLY-OP 7))
   ((EVALUATE-AND-APPLY 8) (APPLY-OP 8))
   ((EVALUATE-AND-APPLY 9) (APPLY-OP 9)))
  :L-R-Link COMPOSITION
  :Doc
  ("evaluates the arguments ~A, fetches the operation ~A and applies~%~
it to the evaluated arguments."
   (INPUT-PORT-NAME> (DOC-BP> (EVALUATE-AND-APPLY 2)))
   (INPUT-PORT-NAME> (DOC-BP> (EVALUATE-AND-APPLY 1)))))

(Defrule INTERPRET-INSTRUCTION
  "Interpret Instruction"
  :RHS-Node-Types
  ((EVAL-APPLY . EVALUATE-AND-APPLY))
  :Input-Embedding
  (((INTERPRET-INSTRUCTION 1) (EVAL-APPLY 1)
    OP)
   ((INTERPRET-INSTRUCTION 1) (EVAL-APPLY 2)
    ARGS)
   ((INTERPRET-INSTRUCTION 2) (EVAL-APPLY 3))
   ((INTERPRET-INSTRUCTION 3) (EVAL-APPLY 4))
   ((INTERPRET-INSTRUCTION 4) (EVAL-APPLY 5)))
  :Output-Embedding
  (((INTERPRET-INSTRUCTION 5) (EVAL-APPLY 7))
   ((INTERPRET-INSTRUCTION 6) (EVAL-APPLY 8))
   ((INTERPRET-INSTRUCTION 7) (EVAL-APPLY 9)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("interprets the instruction ~A by evaluating its arguments ~
~A and applying its operator ~A to them."
   (INPUT-PORT-NAME> (DOC-BP> (INTERPRET-INSTRUCTION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (INTERPRET-INSTRUCTION 1)
                              INST-ARGS))
   (INPUT-PORT-NAME> (DOC-BP> (INTERPRET-INSTRUCTION 1)
                              INST-OP))))

(Defrule LOOKUP-AND-EXECUTE-HANDLER
  "Lookup and Execute Message Handler"
  :RHS-Node-Types
  ((GET-DESTINATION-NODE . LOOKUP-DESTINATION)
   (LOAD-ARGS . LOAD-ARGUMENTS)
   (RECORD-NEW-NODE . RECORD-AT-DESTINATION)
   (GET-HANDLER-CODE . LOOKUP-HANDLER-FOR-MESSAGE)
   (GET-NEXT-INSTRUCTION . FETCH-INSTRUCTION)
   (INTERPRET . INTERPRET-INSTRUCTION)
   (STILL-RUNNING? . RUNNING-STATUS?))
  :Edge-List
  (((GET-DESTINATION-NODE 3) . (LOAD-ARGS 2))
   ((LOAD-ARGS 3) . (INTERPRET 3))
   ((LOAD-ARGS 3) . (RECORD-NEW-NODE 1))
   ((RECORD-NEW-NODE 4) . (INTERPRET 2))
   ((GET-HANDLER-CODE 2) . (INTERPRET 3))
   ((GET-HANDLER-CODE 2) . (GET-NEXT-INSTRUCTION 2))
   ((GET-NEXT-INSTRUCTION 4) . (INTERPRET 3))
   ((GET-NEXT-INSTRUCTION 3) . (INTERPRET 1))
   ((INTERPRET 6) . (STILL-RUNNING? 1)))
  :Input-Embedding
  (((LOOKUP-AND-EXECUTE-HANDLER 1) (RECORD-NEW-NODE 2))
   ((LOOKUP-AND-EXECUTE-HANDLER 1) (LOAD-ARGS 1))
   ((LOOKUP-AND-EXECUTE-HANDLER 1) (GET-DESTINATION-NODE 2))
   ((LOOKUP-AND-EXECUTE-HANDLER 1) (GET-HANDLER-CODE 1))
   ((LOOKUP-AND-EXECUTE-HANDLER 2) (RECORD-NEW-NODE 3))
   ((LOOKUP-AND-EXECUTE-HANDLER 2) (GET-DESTINATION-NODE 1))
   ((LOOKUP-AND-EXECUTE-HANDLER 3) (INTERPRET 4))
   ((LOOKUP-AND-EXECUTE-HANDLER 4) (GET-NEXT-INSTRUCTION 1))
   ((LOOKUP-AND-EXECUTE-HANDLER 5) (INTERPRET 3)))
  :Output-Embedding
  (((LOOKUP-AND-EXECUTE-HANDLER 6) (INTERPRET 5))
   ((LOOKUP-AND-EXECUTE-HANDLER 7) (INTERPRET 7)))
  :L-R-Link COMPOSITION
  :Doc
  ("looks up the handler for the message ~A, loads the ~
arguments of the message into the message's destination ~
node, and then executes the handler instructions, starting ~
with the one pointed to by ~A. As long as the execution ~
context's status is ~A, the next instruction (pointed to ~
by ~A) is executed."
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-AND-EXECUTE-HANDLER 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-AND-EXECUTE-HANDLER 4)))
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-AND-EXECUTE-HANDLER 5)))
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-AND-EXECUTE-HANDLER 4)))))

(Defrule FETCH-INSTRUCTION
  "Fetch Next Instruction"
  :RHS-Node-Types
  ((FETCH-II . INDEXED-SEQUENCE-EXTRACT))
  :Output-Embedding
  (((FETCH-INSTRUCTION 3) (FETCH-II 2))
   ((FETCH-INSTRUCTION 4) (FETCH-II 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("fetches the next instruction (pointed to by ~A) in the ~
sequence ~A"
   (INPUT-PORT-NAME> (DOC-BP> (FETCH-INSTRUCTION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH-INSTRUCTION 2)))))

(Defrule LOAD-ARGUMENTS-INTO-MEMORY
  "Load Arguments into Memory"
  :RHS-Node-Types
  ((TRANSFER-ARG-LIST . LIST-TO-SEQUENCE)
   (ADD-TO-MEMORY . ASSOCIATIVE-SET-ADD))
  :Edge-List
  (((TRANSFER-ARG-LIST 3) . (ADD-TO-MEMORY 1)))
  :Input-Embedding
  (((LOAD-ARGUMENTS-INTO-MEMORY 1) (TRANSFER-ARG-LIST 1)
    ARGUMENTS)
   ((LOAD-ARGUMENTS-INTO-MEMORY 1) (TRANSFER-ARG-LIST 2)
    STORAGE-REQUIREMENTS)
   ((LOAD-ARGUMENTS-INTO-MEMORY 2) (ADD-TO-MEMORY 3)))
  :Output-Embedding
  (((LOAD-ARGUMENTS-INTO-MEMORY 3) (ADD-TO-MEMORY 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("takes the list of arguments in the message ~A and converts it to ~
an indexed-sequence of size ~A, which it then stores in the memory ~
~A, at key ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS-INTO-MEMORY 1)
                              ARGUMENTS))
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS-INTO-MEMORY 1)
                              STORAGE-REQUIREMENTS))
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS-INTO-MEMORY 2)))
   (INPUT-PORT-NAME> (DOC-BP> (ADD-TO-MEMORY 2)))))

(Defrule LOAD-ARGUMENTS-INTO-SN
  "Load Arguments into Synch-Node"
  :RHS-Node-Types
  ((BASE-LOAD-ARGUMENTS . LOAD-ARGUMENTS-INTO-MEMORY))
  :Input-Embedding
  (((LOAD-ARGUMENTS-INTO-SN 1) (BASE-LOAD-ARGUMENTS 1))
   ((LOAD-ARGUMENTS-INTO-SN 2) (BASE-LOAD-ARGUMENTS 2)
    MEMORY))
  :Output-Embedding
  (((LOAD-ARGUMENTS-INTO-SN 3) (BASE-LOAD-ARGUMENTS 3)
    MEMORY))
  :St-Thrus
  (((LOAD-ARGUMENTS-INTO-SN 2) (LOAD-ARGUMENTS-INTO-SN 3)
    LOCAL-BUFFER))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("loads the arguments of the Message ~A into the Memory part of the ~
Node ~A, which is implemented as a Synch-Node."
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS-INTO-SN 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS-INTO-SN 2)))))

(Defrule LOAD-ARGUMENTS-INTO-AN
  "Load Arguments into Asynch-Node"
  :RHS-Node-Types
  ((BASE-LOAD-ARGUMENTS . LOAD-ARGUMENTS-INTO-MEMORY))
  :Input-Embedding
  (((LOAD-ARGUMENTS-INTO-AN 1) (BASE-LOAD-ARGUMENTS 1))
   ((LOAD-ARGUMENTS-INTO-AN 2) (BASE-LOAD-ARGUMENTS 2) MEMORY))
  :Output-Embedding
  (((LOAD-ARGUMENTS-INTO-AN 3) (BASE-LOAD-ARGUMENTS 3) MEMORY))
  :St-Thrus
  (((LOAD-ARGUMENTS-INTO-AN 2) (LOAD-ARGUMENTS-INTO-AN 3)
    TIME))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("loads the arguments of the Message ~A into the Memory part of the ~
Node ~A which is implemented as an Asynch-Node."
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS-INTO-AN 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS-INTO-AN 2)))))

(Defrule LOAD-ARGUMENTS
  "Load Arguments"
  :RHS-Node-Types
  ((LOAD-AN . LOAD-ARGUMENTS-INTO-AN))
  :Input-Embedding
  (((LOAD-ARGUMENTS 1) (LOAD-AN 1))
   ((LOAD-ARGUMENTS 2) (LOAD-AN 2)
    ASYNCH-NODE>NODE))
  :Output-Embedding
  (((LOAD-ARGUMENTS 3) (LOAD-AN 3)
    ASYNCH-NODE>NODE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("loads the arguments of Message ~A into the memory of node ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS 2)))))

(Defrule LOAD-ARGUMENTS
  "Load Arguments"
  :RHS-Node-Types
  ((LOAD-SN . LOAD-ARGUMENTS-INTO-SN))
  :Input-Embedding
  (((LOAD-ARGUMENTS 1) (LOAD-SN 1))
   ((LOAD-ARGUMENTS 2) (LOAD-SN 2)
    SYNCH-NODE>NODE))
  :Output-Embedding
  (((LOAD-ARGUMENTS 3) (LOAD-SN 3)
    SYNCH-NODE>NODE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("loads the arguments of Message ~A into the memory of node ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LOAD-ARGUMENTS 2)))))




(Defrule FETCH+UPDATE
  "Fetch and Update"
  :RHS-Node-Types
  ((FETCH-FROM-BASE . SELECT-TERM)
   (BACKUP-INDEX . INCREMENT-OR-DECREMENT))
  :Input-Embedding
  (((FETCH+UPDATE 1) (FETCH-FROM-BASE 2) INDEX)
   ((FETCH+UPDATE 1) (BACKUP-INDEX 1) INDEX)
   ((FETCH+UPDATE 1) (FETCH-FROM-BASE 1) BASE))
  :Output-Embedding
  (((FETCH+UPDATE 2) (FETCH-FROM-BASE 3))
   ((FETCH+UPDATE 3) (BACKUP-INDEX 2) INDEX))
  :St-Thrus
  (((FETCH+UPDATE 1) (FETCH+UPDATE 3) BASE))
  :L-R-Link COMPOSITION
  :Doc
  ("extracts an element from an Indexed-Sequence, which has ~
parts:~%~
Base (an sequence) ~A,~%~
and an Index ~A into the sequence.~%~
The sequence is filled in ~A. The Index is updated after ~
the output is fetched from the Base."
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+UPDATE 1) BASE))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+UPDATE 1) INDEX))
   (GROWTH-DIRECTION (N> FETCH+UPDATE))))


(Defrule INDEXED-SEQUENCE-INSERT
  "Indexed-Sequence Insert"
  :RHS-Node-Types
  ((I-S-INSERT2 . UPDATE+BUMP))
  :Input-Embedding
  (((INDEXED-SEQUENCE-INSERT 1) (I-S-INSERT2 1))
   ((INDEXED-SEQUENCE-INSERT 2) (I-S-INSERT2 2)))
  :Output-Embedding
  (((INDEXED-SEQUENCE-INSERT 3) (I-S-INSERT2 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A into the Indexed Sequence ~A."
   (INPUT-PORT-NAME> (DOC-BP> (INDEXED-SEQUENCE-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (INDEXED-SEQUENCE-INSERT 2)))))


(Defrule UPDATE+FETCH
  "Update and Fetch"
  :RHS-Node-Types
  ((FETCH-FROM-BASE2 . SELECT-TERM)
   (BACKUP-INDEX2 . INCREMENT-OR-DECREMENT))
  :Edge-List
  (((BACKUP-INDEX2 2) . (FETCH-FROM-BASE2 2)))
  :Input-Embedding
  (((UPDATE+FETCH 1) (BACKUP-INDEX2 1) INDEX)
   ((UPDATE+FETCH 1) (FETCH-FROM-BASE2 1) BASE))
  :Output-Embedding
  (((UPDATE+FETCH 2) (FETCH-FROM-BASE2 3))
   ((UPDATE+FETCH 3) (BACKUP-INDEX2 2) INDEX))
  :St-Thrus
  (((UPDATE+FETCH 1) (UPDATE+FETCH 3) BASE))
  :L-R-Link COMPOSITION
  :Doc
  ("extracts an element from an Indexed-Sequence, which has ~
parts:~%~
Base (an sequence) ~A,~%~
and an Index ~A into the sequence.~%~
The sequence is filled in ~A. The index is updated before ~
the output is fetched from the Base."
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE+FETCH 1) BASE))
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE+FETCH 1) INDEX))
   (GROWTH-DIRECTION (N> UPDATE+FETCH))))


(Defrule UPDATE+BUMP
  "Update and Bump"
  :RHS-Node-Types
  ((BUMP-INDEX . INCREMENT-OR-DECREMENT)
   (ADD-TO-BASE . NEW-TERM))
  :Edge-List
  (((BUMP-INDEX 2) . (ADD-TO-BASE 2)))
  :Input-Embedding
  (((UPDATE+BUMP 2) (BUMP-INDEX 1) INDEX)
   ((UPDATE+BUMP 2) (ADD-TO-BASE 3) BASE)
   ((UPDATE+BUMP 1) (ADD-TO-BASE 1)))
  :Output-Embedding
  (((UPDATE+BUMP 3) (BUMP-INDEX 2) INDEX)
   ((UPDATE+BUMP 3) (ADD-TO-BASE 4) BASE))
  :L-R-Link COMPOSITION
  :Doc
  ("adds ~A to an Indexed-Sequence, which has parts:~%~
Base (an sequence) ~A,~%~
and an Index ~A into the sequence.~%~
The sequence is filled in ~A.~%~
The Index is updated before the input is added to the Base."
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE+BUMP 1)))
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE+BUMP 2) BASE))
   (INPUT-PORT-NAME> (DOC-BP> (UPDATE+BUMP 2) INDEX))
   (GROWTH-DIRECTION (N> UPDATE+BUMP))))


(Defrule BUMP+UPDATE
  "Bump and Update"
  :RHS-Node-Types
  ((BUMP-INDEX2 . INCREMENT-OR-DECREMENT)
   (ADD-TO-BASE2 . NEW-TERM))
  :Input-Embedding
  (((BUMP+UPDATE 2) (ADD-TO-BASE2 2) INDEX)
   ((BUMP+UPDATE 2) (BUMP-INDEX2 1) INDEX)
   ((BUMP+UPDATE 2) (ADD-TO-BASE2 3) BASE)
   ((BUMP+UPDATE 1) (ADD-TO-BASE2 1)))
  :Output-Embedding
  (((BUMP+UPDATE 3) (BUMP-INDEX2 2) INDEX)
   ((BUMP+UPDATE 3) (ADD-TO-BASE2 4) BASE))
  :L-R-Link COMPOSITION
  :Doc
  ("adds ~A to an Indexed-Sequence, which has parts:~%~
Base (an sequence) ~A,~%~
and an Index ~A into the sequence.~%~
The sequence is filled in ~A.~%~
The Index is updated after the input is added to the Base."
   (INPUT-PORT-NAME> (DOC-BP> (BUMP+UPDATE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (BUMP+UPDATE 2) BASE))
   (INPUT-PORT-NAME> (DOC-BP> (BUMP+UPDATE 2) INDEX))
   (GROWTH-DIRECTION (N> BUMP+UPDATE))))


(Defrule INDEXED-SEQUENCE-INSERT
  "Indexed-Sequence Insert"
  :RHS-Node-Types
  ((I-S-INSERT1 . BUMP+UPDATE))
  :Input-Embedding
  (((INDEXED-SEQUENCE-INSERT 1) (I-S-INSERT1 1))
   ((INDEXED-SEQUENCE-INSERT 2) (I-S-INSERT1 2)))
  :Output-Embedding
  (((INDEXED-SEQUENCE-INSERT 3) (I-S-INSERT1 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A into the Indexed Sequence ~A."
   (INPUT-PORT-NAME> (DOC-BP> (INDEXED-SEQUENCE-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (INDEXED-SEQUENCE-INSERT 2)))))

(Defrule INDEXED-SEQUENCE-EXTRACT
  "Indexed-Sequence Extract"
  :RHS-Node-Types
  ((I-S-EXTRACT2 . UPDATE+FETCH))
  :Input-Embedding
  (((INDEXED-SEQUENCE-EXTRACT 1) (I-S-EXTRACT2 1)))
  :Output-Embedding
  (((INDEXED-SEQUENCE-EXTRACT 2) (I-S-EXTRACT2 2))
   ((INDEXED-SEQUENCE-EXTRACT 3) (I-S-EXTRACT2 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("extracts the current element from the Indexed sequence ~A."
   (INPUT-PORT-NAME> (DOC-BP> (INDEXED-SEQUENCE-EXTRACT 1)))))

(Defrule INDEXED-SEQUENCE-EXTRACT
  "Indexed-Sequence Extract"
  :RHS-Node-Types
  ((I-S-EXTRACT1 . FETCH+UPDATE))
  :Input-Embedding
  (((INDEXED-SEQUENCE-EXTRACT 1) (I-S-EXTRACT1 1)))
  :Output-Embedding
  (((INDEXED-SEQUENCE-EXTRACT 2) (I-S-EXTRACT1 2))
   ((INDEXED-SEQUENCE-EXTRACT 3) (I-S-EXTRACT1 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("extracts the current element from the Indexed sequence ~A."
   (INPUT-PORT-NAME> (DOC-BP> (INDEXED-SEQUENCE-EXTRACT 1)))))
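
;;; The two INDEXED-SEQUENCE-EXTRACT rules differ only in whether the
;;; index is moved after the fetch (FETCH+UPDATE) or before it
;;; (UPDATE+FETCH). A rough Common Lisp sketch of that difference follows.
;;; The function names are borrowed from the rule names for readability,
;;; and the STEP argument and the use of multiple values are assumptions
;;; made for this example, not features of the grammar.

```lisp
;; Hypothetical functions illustrating the two orderings.
(defun fetch+update (base index &optional (step 1))
  "Fetch the element at INDEX, then move the index by STEP.
Returns the element and the new index."
  (values (aref base index) (+ index step)))

(defun update+fetch (base index &optional (step 1))
  "Move the index by STEP first, then fetch at the new position.
Returns the element and the new index."
  (let ((new-index (+ index step)))
    (values (aref base new-index) new-index)))
```

;;; With BASE #(a b c) and INDEX 1, FETCH+UPDATE returns B and index 2,
;;; whereas UPDATE+FETCH with STEP -1 returns A and index 0.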

(Defrule INDEXED-SEQUENCE-ACCUMULATION
  "Indexed-Sequence Accumulation"
  :RHS-Node-Types
  ((INSERT-INTO-I-S . INDEXED-SEQUENCE-INSERT))
  :Input-Embedding
  (((INDEXED-SEQUENCE-ACCUMULATION 1) (INSERT-INTO-I-S 1))
   ((INDEXED-SEQUENCE-ACCUMULATION 2) (INSERT-INTO-I-S 2)))
  :St-Thrus
  (((INDEXED-SEQUENCE-ACCUMULATION 2) (INDEXED-SEQUENCE-ACCUMULATION 3)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("accumulates the elements in the series into a new indexed-sequence."
   (INPUT-PORT-NAME> (DOC-BP> (INDEXED-SEQUENCE-ACCUMULATION 1)))))


(Defrule ASSOCIATIVE-SET-ADD
  "Associative Set Add"
  :RHS-Node-Types
  ((THE-ALIST-INSERT . ASSOCIATIVE-LIST-INSERT))
  :Input-Embedding
  (((ASSOCIATIVE-SET-ADD 1) (THE-ALIST-INSERT 1))
   ((ASSOCIATIVE-SET-ADD 2) (THE-ALIST-INSERT 2))
   ((ASSOCIATIVE-SET-ADD 3) (THE-ALIST-INSERT 3)))
  :Output-Embedding
  (((ASSOCIATIVE-SET-ADD 4) (THE-ALIST-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A (associated w/ key ~A) in the associative set ~A.~
An element X occurs before another Y if X's key ~A Y's key. ~
An element X replaces another Y if X's key ~A Y's key."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-ADD 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-ADD 2)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-ADD 3)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-COMPARATOR-INFO (N> THE-ALIST-INSERT))))
   (FUNCTION-TYPE (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> THE-ALIST-INSERT))))))

(Defrule ASSOCIATIVE-SET-ADD
  "Associative Set Add"
  :RHS-Node-Types
  ((THE-HT-INSERT . HASH-INSERT))
  :Input-Embedding
  (((ASSOCIATIVE-SET-ADD 1) (THE-HT-INSERT 1))
   ((ASSOCIATIVE-SET-ADD 2) (THE-HT-INSERT 2))
   ((ASSOCIATIVE-SET-ADD 3) (THE-HT-INSERT 3)))
  :Output-Embedding
  (((ASSOCIATIVE-SET-ADD 4) (THE-HT-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A (associated w/ key ~A) in the associative set ~A.~
An element X occurs before another Y if X's key ~A Y's key.~
An element X replaces another Y if X's key ~A Y's key."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-ADD 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-ADD 2)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-ADD 3)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-COMPARATOR-INFO (N> THE-HT-INSERT))))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> THE-HT-INSERT))))))
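
;;; The two ASSOCIATIVE-SET-ADD rules give alternative implementations of
;;; the same abstract operation, one through ASSOCIATIVE-LIST-INSERT and
;;; one through HASH-INSERT. The sketch below shows that idea in plain
;;; Common Lisp. The function names and argument order are hypothetical,
;;; and the association-list version ignores the key ordering that the
;;; rule's documentation string mentions; it only replaces an entry whose
;;; key matches under the equality test.

```lisp
;; Hypothetical helpers; neither is taken from the analyzed programs.
(defun alist-set-add (element key alist &key (test #'eql))
  "Return an association list in which KEY maps to ELEMENT,
replacing any existing entry whose key matches KEY under TEST."
  (acons key element (remove key alist :key #'car :test test)))

(defun hash-set-add (element key table)
  "Store ELEMENT under KEY in the hash table TABLE and return TABLE."
  (setf (gethash key table) element)
  table)
```

;;; Either function can stand behind a single abstract "add" operation,
;;; which is the relationship that the IMPLEMENTATION link expresses.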

(Defrule ASSOCIATIVE-SET-REMOVE
  "Associative Set Remove"
  :RHS-Node-Types
  ((THE-ALIST-DELETE . ASSOCIATIVE-LIST-DELETE))
  :Input-Embedding
  (((ASSOCIATIVE-SET-REMOVE 1) (THE-ALIST-DELETE 1))
   ((ASSOCIATIVE-SET-REMOVE 2) (THE-ALIST-DELETE 2)))
  :Output-Embedding
  (((ASSOCIATIVE-SET-REMOVE 3) (THE-ALIST-DELETE 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("deletes an element associated w/ key ~A in the associative ~
set ~A. An element X occurs before another Y if X's key ~A ~
Y's key. Keys are compared using ~A."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-REMOVE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-REMOVE 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-COMPARATOR-INFO (N> THE-ALIST-DELETE))))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> THE-ALIST-DELETE))))))

(Defrule ASSOCIATIVE-SET-REMOVE
  "Associative Set Remove"
  :RHS-Node-Types
  ((THE-HT-DELETE . HASH-DELETE))
  :Input-Embedding
  (((ASSOCIATIVE-SET-REMOVE 1) (THE-HT-DELETE 1))
   ((ASSOCIATIVE-SET-REMOVE 2) (THE-HT-DELETE 2)))
  :Output-Embedding
  (((ASSOCIATIVE-SET-REMOVE 3) (THE-HT-DELETE 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("deletes an element associated w/ key ~A in the associative ~
    set ~A. An element X occurs before another Y if X's key ~A ~
    Y's key. Keys are compared for equality using ~A."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-REMOVE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-REMOVE 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-COMPARATOR-INFO (N> THE-HT-DELETE))))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> THE-HT-DELETE))))))

(Defrule ASSOCIATIVE-SET-LOOKUP
  "Associative Set Lookup"
  :RHS-Node-Types
  ((THE-ALIST-LOOKUP . ASSOCIATIVE-LIST-LOOKUP))
  :Input-Embedding
  (((ASSOCIATIVE-SET-LOOKUP 1) (THE-ALIST-LOOKUP 1))
   ((ASSOCIATIVE-SET-LOOKUP 2) (THE-ALIST-LOOKUP 2)))
  :Output-Embedding
  (((ASSOCIATIVE-SET-LOOKUP 3) (THE-ALIST-LOOKUP 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up an element associated w/ key ~A in the associative ~
    set ~A. An element X occurs before another Y if X's key ~A ~
    Y's key. Keys are compared using ~A."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-LOOKUP 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-LOOKUP 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-COMPARATOR-INFO (N> THE-ALIST-LOOKUP))))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> THE-ALIST-LOOKUP))))))

(Defrule ASSOCIATIVE-SET-LOOKUP
  "Associative Set Lookup"
  :RHS-Node-Types
  ((THE-HT-LOOKUP . HASH-LOOKUP))
  :Input-Embedding
  (((ASSOCIATIVE-SET-LOOKUP 1) (THE-HT-LOOKUP 1))
   ((ASSOCIATIVE-SET-LOOKUP 2) (THE-HT-LOOKUP 2)))
  :Output-Embedding
  (((ASSOCIATIVE-SET-LOOKUP 3) (THE-HT-LOOKUP 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up an element associated w/ key ~A in the associative set ~A.
    An element X occurs before another Y if X's key ~A Y's key. ~
    An element X is retrieved if X's key ~A ~A."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-LOOKUP 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-LOOKUP 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-COMPARATOR-INFO (N> THE-HT-LOOKUP))))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> THE-HT-LOOKUP))))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-SET-LOOKUP 1)))))

(Defrule PROPERTY-LIST-LOOKUP
  "Property List Lookup"
  :RHS-Node-Types
  ((GET-AT-INDICATOR . GET))
  :Input-Embedding
  (((PROPERTY-LIST-LOOKUP 1) (GET-AT-INDICATOR 1))
   ((PROPERTY-LIST-LOOKUP 2) (GET-AT-INDICATOR 2)))
  :Output-Embedding
  (((PROPERTY-LIST-LOOKUP 3) (GET-AT-INDICATOR 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the value associated w/ the indicator ~A in the
    property-list of the symbol ~A."
   (INPUT-PORT-NAME> (DOC-BP> (PROPERTY-LIST-LOOKUP 2)))
   (INPUT-PORT-NAME> (DOC-BP> (PROPERTY-LIST-LOOKUP 1)))))

(Defrule HASH-LOOKUP
  "Hash Table Lookup"
  :RHS-Node-Types
  ((CHT-LOOKUP . CHAINING-HT-LOOKUP))
  :Input-Embedding
  (((HASH-LOOKUP 1) (CHT-LOOKUP 1))
   ((HASH-LOOKUP 2) (CHT-LOOKUP 2)))
  :Output-Embedding
  (((HASH-LOOKUP 3) (CHT-LOOKUP 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up an element with key ~A from the Hash-Table ~A."
   (INPUT-PORT-NAME> (DOC-BP> (HASH-LOOKUP 1)))
   (INPUT-PORT-NAME> (ALL-BP> (HASH-LOOKUP 2)))))

(Defrule HASH-DELETE
  "Hash Table Delete"
  :RHS-Node-Types
  ((CHT-DELETE . CHAINING-HT-DELETE))
  :Input-Embedding
  (((HASH-DELETE 1) (CHT-DELETE 1))
   ((HASH-DELETE 2) (CHT-DELETE 2)))
  :Output-Embedding
  (((HASH-DELETE 3) (CHT-DELETE 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("deletes an element with key ~A from the Hash-Table ~A."
   (INPUT-PORT-NAME> (DOC-BP> (HASH-DELETE 1)))
   (INPUT-PORT-NAME> (ALL-BP> (HASH-DELETE 2)))))

(Defrule HASH-INSERT
  "Hash Table Insert"
  :RHS-Node-Types
  ((CHT-INSERT . CHAINING-HT-INSERT))
  :Input-Embedding
  (((HASH-INSERT 1) (CHT-INSERT 1))
   ((HASH-INSERT 2) (CHT-INSERT 2))
   ((HASH-INSERT 3) (CHT-INSERT 3)))
  :Output-Embedding
  (((HASH-INSERT 4) (CHT-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A with key ~A into the Hash-Table ~A."
   (INPUT-PORT-NAME> (DOC-BP> (HASH-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (HASH-INSERT 2)))
   (INPUT-PORT-NAME> (ALL-BP> (HASH-INSERT 3)))))

(Defrule CHAINING-HT-LOOKUP
  "Chaining Hash Table Lookup"
  :RHS-Node-Types
  ((RETRIEVE-AND-SEARCH . FETCH+LOOKUP))
  :Input-Embedding
  (((CHAINING-HT-LOOKUP 1) (RETRIEVE-AND-SEARCH 1))
   ((CHAINING-HT-LOOKUP 2) (RETRIEVE-AND-SEARCH 2)))
  :Output-Embedding
  (((CHAINING-HT-LOOKUP 3) (RETRIEVE-AND-SEARCH 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up an element with key ~A from the chaining ~
    hash-table ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-LOOKUP 1)))
   (INPUT-PORT-NAME> (ALL-BP> (CHAINING-HT-LOOKUP 2)))))

(Defrule CHAINING-HT-DELETE
  "Chaining Hash Table Delete"
  :RHS-Node-Types
  ((RETRIEVE-AND-DELETE . CHAINING-HT-FILL-COUNT-DELETE))
  :Input-Embedding
  (((CHAINING-HT-DELETE 1) (RETRIEVE-AND-DELETE 1))
   ((CHAINING-HT-DELETE 2) (RETRIEVE-AND-DELETE 2)))
  :Output-Embedding
  (((CHAINING-HT-DELETE 3) (RETRIEVE-AND-DELETE 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("deletes an element with key ~A from the chaining ~
    hash-table ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-DELETE 1)))
   (INPUT-PORT-NAME> (ALL-BP> (CHAINING-HT-DELETE 2)))))

(Defrule CHAINING-HT-INSERT
  "Chaining Hash Table Insert"
  :RHS-Node-Types
  ((RETRIEVE-AND-INSERT . CHAINING-HT-FILL-COUNT-INSERT))
  :Input-Embedding
  (((CHAINING-HT-INSERT 1) (RETRIEVE-AND-INSERT 1))
   ((CHAINING-HT-INSERT 2) (RETRIEVE-AND-INSERT 2))
   ((CHAINING-HT-INSERT 3) (RETRIEVE-AND-INSERT 3)))
  :Output-Embedding
  (((CHAINING-HT-INSERT 4) (RETRIEVE-AND-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A with key ~A into the chaining Hash-Table ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-INSERT 2)))
   (INPUT-PORT-NAME> (ALL-BP> (CHAINING-HT-INSERT 3)))))

(Defrule FETCH+LOOKUP
  "Fetch Bucket and Lookup Element"
  :RHS-Node-Types
  ((HASH-KEY-AND-SIZE . HASH-FUNCTION)
   (GET-BUCKET . SELECT-TERM)
   (LOOKUP . ASSOCIATIVE-LIST-LOOKUP))
  :Edge-List
  (((HASH-KEY-AND-SIZE 3) . (GET-BUCKET 2))
   ((GET-BUCKET 3) . (LOOKUP 2)))
  :Input-Embedding
  (((FETCH+LOOKUP 1) (LOOKUP 1))
   ((FETCH+LOOKUP 1) (HASH-KEY-AND-SIZE 1))
   ((FETCH+LOOKUP 2) (HASH-KEY-AND-SIZE 2)
    NUMBER-BUCKETS)
   ((FETCH+LOOKUP 2) (GET-BUCKET 1)
    BUCKETS))
  :Output-Embedding
  (((FETCH+LOOKUP 3) (LOOKUP 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("looks up an element with key ~A from the hash-table ~A, ~
    which is implemented as a sequence ~A of buckets. The ~
    bucket is fetched by indexing into the sequence using an ~
    index computed by applying a hash function to the key ~
    ~A and the number of buckets in the hash table ~A.~%~
    Each bucket is implemented as an associative list.~%~
    Collision resolution is performed using a chaining strategy."
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+LOOKUP 1)))
   (INPUT-PORT-NAME> (ALL-BP> (FETCH+LOOKUP 2)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+LOOKUP 2) BUCKETS))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+LOOKUP 1)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+LOOKUP 2) NUMBER-BUCKETS))))
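
The kind of code the FETCH+LOOKUP rule is meant to recognize can be sketched in a few lines of Common Lisp. This is only an illustration, not code from the simulator programs: the function names `bucket-index` and `fetch+lookup`, the use of `sxhash` as the hash function, and the representation of the buckets as a vector of association lists are all assumptions made for the example.

```lisp
;; Hypothetical sketch; names and representation are assumptions.
(defun bucket-index (key number-buckets)
  "Hash KEY into the range [0, NUMBER-BUCKETS)."
  (mod (sxhash key) number-buckets))

(defun fetch+lookup (key buckets &key (test #'eql))
  "Look up KEY in BUCKETS, a vector of association lists.
Returns the (key . value) pair, or NIL if KEY is absent."
  (let* ((index (bucket-index key (length buckets)))  ; HASH-FUNCTION
         (bucket (aref buckets index)))               ; SELECT-TERM
    (assoc key bucket :test test)))                   ; ASSOCIATIVE-LIST-LOOKUP

;; Usage: a table of four empty buckets with one entry chained in.
(defparameter *buckets* (make-array 4 :initial-element '()))
(push (cons 'a 1) (aref *buckets* (bucket-index 'a 4)))
(fetch+lookup 'a *buckets*)   ; => (A . 1)
```

The three steps in the body correspond to the three right-hand-side node types of the rule, and the dataflow between them follows its edge list: the hash result selects the bucket, and the bucket is the list that is searched.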

(Defrule FETCH+DELETE
  "Fetch Bucket and Delete Element"
  :RHS-Node-Types
  ((HASH-THE-KEY . HASH-FUNCTION)
   (FETCH-BUCKET . SELECT-TERM)
   (REMOVE . ASSOCIATIVE-LIST-DELETE)
   (UPDATE-BUCKETS . NEW-TERM))
  :Edge-List
  (((HASH-THE-KEY 3) . (UPDATE-BUCKETS 2))
   ((HASH-THE-KEY 3) . (FETCH-BUCKET 2))
   ((FETCH-BUCKET 3) . (REMOVE 2))
   ((REMOVE 3) . (UPDATE-BUCKETS 1)))
  :Input-Embedding
  (((FETCH+DELETE 1) (REMOVE 1))
   ((FETCH+DELETE 1) (HASH-THE-KEY 1))
   ((FETCH+DELETE 2) (HASH-THE-KEY 2)
    NUMBER-BUCKETS)
   ((FETCH+DELETE 2) (UPDATE-BUCKETS 3)
    BUCKETS)
   ((FETCH+DELETE 2) (FETCH-BUCKET 1)
    BUCKETS))
  :Output-Embedding
  (((FETCH+DELETE 3) (UPDATE-BUCKETS 4)
    BUCKETS))
  :St-Thrus
  (((FETCH+DELETE 2) (FETCH+DELETE 3)
    NUMBER-BUCKETS))
  :L-R-Link COMPOSITION
  :Doc
  ("deletes an element with key ~A from the hash-table ~A, which is ~
    implemented as a sequence ~A of buckets. The bucket is fetched by ~
    indexing into the sequence using an index computed by applying a ~
    hash function to the key ~A and the number of buckets in the hash ~
    table ~A.~%~
    Each bucket is implemented as an associative list.~%~
    Collision resolution is performed using a chaining strategy."
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+DELETE 1)))
   (INPUT-PORT-NAME> (ALL-BP> (FETCH+DELETE 2)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+DELETE 2) BUCKETS))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+DELETE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+DELETE 2) NUMBER-BUCKETS))))

(Defrule FETCH+INSERT
  "Fetch Bucket and Insert Element"
  :RHS-Node-Types
  ((COMPUTE-HASH . HASH-FUNCTION)
   (FETCH . SELECT-TERM)
   (INSERT . ASSOCIATIVE-LIST-INSERT)
   (UPDATE . NEW-TERM))
  :Edge-List
  (((COMPUTE-HASH 3) . (UPDATE 2))
   ((COMPUTE-HASH 3) . (FETCH 2))
   ((FETCH 3) . (INSERT 3))
   ((INSERT 4) . (UPDATE 1)))
  :Input-Embedding
  (((FETCH+INSERT 1) (INSERT 1))
   ((FETCH+INSERT 2) (INSERT 2))
   ((FETCH+INSERT 2) (COMPUTE-HASH 1))
   ((FETCH+INSERT 3) (COMPUTE-HASH 2)
    NUMBER-BUCKETS)
   ((FETCH+INSERT 3) (UPDATE 3)
    BUCKETS)
   ((FETCH+INSERT 3) (FETCH 1)
    BUCKETS))
  :Output-Embedding
  (((FETCH+INSERT 4) (UPDATE 4)
    BUCKETS))
  :St-Thrus
  (((FETCH+INSERT 3) (FETCH+INSERT 4)
    NUMBER-BUCKETS))
  :L-R-Link COMPOSITION
  :Doc
  ("inserts ~A into the hash-table ~A, which is implemented as a ~
    sequence ~A of buckets. The bucket is fetched by indexing into ~
    the sequence using an index computed by applying a hash function ~
    to the key ~A and the number of buckets in the hash table ~A.~%~
    Each bucket is implemented as an associative list.~%~
    Collision resolution is performed using a chaining strategy."
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+INSERT 1)))
   (INPUT-PORT-NAME> (ALL-BP> (FETCH+INSERT 3)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+INSERT 3) BUCKETS))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+INSERT 2)))
   (INPUT-PORT-NAME> (DOC-BP> (FETCH+INSERT 3) NUMBER-BUCKETS))))

(Defrule CHAINING-HT-FILL-COUNT-DELETE
  "Hash Table with Fill Count Delete"
  :RHS-Node-Types
  ((DELETE-ELEMENT . FETCH+DELETE)
   (DECREMENT-ELT-COUNT . DECREMENT))
  :Input-Embedding
  (((CHAINING-HT-FILL-COUNT-DELETE 1) (DELETE-ELEMENT 1))
   ((CHAINING-HT-FILL-COUNT-DELETE 2) (DELETE-ELEMENT 2)
    HASH-TABLE)
   ((CHAINING-HT-FILL-COUNT-DELETE 2) (DECREMENT-ELT-COUNT 1)
    FILL-COUNT))
  :Output-Embedding
  (((CHAINING-HT-FILL-COUNT-DELETE 3) (DELETE-ELEMENT 3)
    HASH-TABLE)
   ((CHAINING-HT-FILL-COUNT-DELETE 3) (DECREMENT-ELT-COUNT 2)
    FILL-COUNT))
  :St-Thrus
  (((CHAINING-HT-FILL-COUNT-DELETE 2) (CHAINING-HT-FILL-COUNT-DELETE 3)
    FILL-COUNT))
  :L-R-Link COMPOSITION
  :Doc
  ("deletes an element with key ~A from the chaining ~
    Hash-Table+Fill-Count ~A. This is a hash-table which ~
    contains a fill count ~A, keeping track of the number of ~
    elements in the hash table."
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-FILL-COUNT-DELETE 1)))
   (INPUT-PORT-NAME> (ALL-BP> (CHAINING-HT-FILL-COUNT-DELETE 2)))
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-FILL-COUNT-DELETE 2)
                              FILL-COUNT))))

(Defrule CHAINING-HT-FILL-COUNT-INSERT
  "Hash Table with Fill Count Insert"
  :RHS-Node-Types
  ((ADD-ELEMENT . FETCH+INSERT)
   (INCREMENT-ELT-COUNT . INCREMENT))
  :Input-Embedding
  (((CHAINING-HT-FILL-COUNT-INSERT 1) (ADD-ELEMENT 1))
   ((CHAINING-HT-FILL-COUNT-INSERT 2) (ADD-ELEMENT 2))
   ((CHAINING-HT-FILL-COUNT-INSERT 3) (ADD-ELEMENT 3)
    HASH-TABLE)
   ((CHAINING-HT-FILL-COUNT-INSERT 3) (INCREMENT-ELT-COUNT 1)
    FILL-COUNT))
  :Output-Embedding
  (((CHAINING-HT-FILL-COUNT-INSERT 4) (ADD-ELEMENT 4)
    HASH-TABLE)
   ((CHAINING-HT-FILL-COUNT-INSERT 4) (INCREMENT-ELT-COUNT 2)
    FILL-COUNT))
  :St-Thrus
  (((CHAINING-HT-FILL-COUNT-INSERT 3)
    (CHAINING-HT-FILL-COUNT-INSERT 4)
    FILL-COUNT))
  :L-R-Link COMPOSITION
  :Doc
  ("inserts ~A with key ~A into the chaining ~
    Hash-Table+Fill-Count ~A. This is a hash-table which ~
    contains a fill count ~A, keeping track of the number of ~
    elements in the hash table."
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-FILL-COUNT-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-FILL-COUNT-INSERT 2)))
   (INPUT-PORT-NAME> (ALL-BP> (CHAINING-HT-FILL-COUNT-INSERT 3)))
   (INPUT-PORT-NAME> (DOC-BP> (CHAINING-HT-FILL-COUNT-INSERT 3)
                              FILL-COUNT))))
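
As a rough illustration of the composite structure this rule describes, the following Common Lisp sketch pairs a bucket insert with an increment of a fill count. The structure `counted-table`, the functions `new-counted-table` and `counted-insert`, and the use of `sxhash` are hypothetical names and choices for this example, not taken from the simulator programs. The sketch also assumes the key is not already present, so the increment is unconditional, as in the rule.

```lisp
;; Hypothetical sketch; assumes KEY is not already in the table.
(defstruct counted-table
  buckets           ; HASH-TABLE part: vector of association lists
  (fill-count 0))   ; FILL-COUNT part: number of stored elements

(defun new-counted-table (number-buckets)
  "Make an empty table with NUMBER-BUCKETS buckets."
  (make-counted-table
   :buckets (make-array number-buckets :initial-element '())))

(defun counted-insert (element key table)
  "Chain (KEY . ELEMENT) into its bucket and bump the fill count."
  (let* ((buckets (counted-table-buckets table))
         (index (mod (sxhash key) (length buckets))))
    (push (cons key element) (aref buckets index))   ; FETCH+INSERT
    (incf (counted-table-fill-count table))          ; INCREMENT
    table))

;; Usage: two inserts leave a fill count of 2.
(defparameter *table* (new-counted-table 8))
(counted-insert 'alpha 'k1 *table*)
(counted-insert 'beta 'k2 *table*)
(counted-table-fill-count *table*)   ; => 2
```

The two statements in the body of `counted-insert` play the roles of the rule's two right-hand-side nodes; neither depends on the other's result, which matches the absence of an edge list in the rule.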

;;;  Figure  4-24. 

(Defrule LOOKUP-DESTINATION
  "Lookup Destination Node"
  :RHS-Node-Types
  ((COMPUTE-DEST . SELECT-TERM))
  :Input-Embedding
  (((LOOKUP-DESTINATION 1) (COMPUTE-DEST 1))
   ((LOOKUP-DESTINATION 2) (COMPUTE-DEST 2)
    DEST-ADDR))
  :Output-Embedding
  (((LOOKUP-DESTINATION 3) (COMPUTE-DEST 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("looks up the node whose address is in the Dest-Addr part of ~
    message ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LOOKUP-DESTINATION 2)))))

;;; Figure 4-24.

(Defrule RECORD-AT-DESTINATION
  "Record Node at Message Destination"
  :RHS-Node-Types
  ((RECORD . NEW-TERM))
  :Input-Embedding
  (((RECORD-AT-DESTINATION 1) (RECORD 1))
   ((RECORD-AT-DESTINATION 2) (RECORD 2)
    DEST-ADDR)
   ((RECORD-AT-DESTINATION 3) (RECORD 3)))
  :Output-Embedding
  (((RECORD-AT-DESTINATION 4) (RECORD 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("records node ~A at the address in the Dest-Addr part of ~
    Message ~A in the address map ~A."
   (INPUT-PORT-NAME> (DOC-BP> (RECORD-AT-DESTINATION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (RECORD-AT-DESTINATION 2)))
   (INPUT-PORT-NAME> (DOC-BP> (RECORD-AT-DESTINATION 3)))))

(Defrule ASSOCIATIVE-LIST-LOOKUP
  "Associative Linked List Lookup"
  :RHS-Node-Types
  ((THE-UOAL-LOOKUP . UNORDERED-ASSOC-LIST-LOOKUP))
  :Input-Embedding
  (((ASSOCIATIVE-LIST-LOOKUP 1) (THE-UOAL-LOOKUP 1))
   ((ASSOCIATIVE-LIST-LOOKUP 2) (THE-UOAL-LOOKUP 2)))
  :Output-Embedding
  (((ASSOCIATIVE-LIST-LOOKUP 3) (THE-UOAL-LOOKUP 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the element associated w/ key ~A ~A in the associative
    list ~A."
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> ASSOCIATIVE-LIST-LOOKUP))))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-LOOKUP 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-LOOKUP 2)))))

(Defrule ASSOCIATIVE-LIST-LOOKUP
  "Associative Linked List Lookup"
  :RHS-Node-Types
  ((THE-OAL-LOOKUP . ORDERED-ASSOC-LIST-LOOKUP))
  :Input-Embedding
  (((ASSOCIATIVE-LIST-LOOKUP 1) (THE-OAL-LOOKUP 1))
   ((ASSOCIATIVE-LIST-LOOKUP 2) (THE-OAL-LOOKUP 2)))
  :Output-Embedding
  (((ASSOCIATIVE-LIST-LOOKUP 3) (THE-OAL-LOOKUP 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("looks up the element associated w/ key ~A ~A in the associative
    list ~A."
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> ASSOCIATIVE-LIST-LOOKUP))))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-LOOKUP 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-LOOKUP 2)))))

(Defrule ASSOCIATIVE-LIST-DELETE
  "Associative Linked List Delete"
  :RHS-Node-Types
  ((THE-UOAL-DELETE . UNORDERED-ASSOC-LIST-DELETE))
  :Input-Embedding
  (((ASSOCIATIVE-LIST-DELETE 1) (THE-UOAL-DELETE 1))
   ((ASSOCIATIVE-LIST-DELETE 2) (THE-UOAL-DELETE 2)))
  :Output-Embedding
  (((ASSOCIATIVE-LIST-DELETE 3) (THE-UOAL-DELETE 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("deletes the element associated w/ key ~A ~A in the associative ~
    list ~A."
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> ASSOCIATIVE-LIST-DELETE))))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-DELETE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-DELETE 2)))))

(Defrule ASSOCIATIVE-LIST-DELETE
  "Associative Linked List Delete"
  :RHS-Node-Types
  ((THE-OAL-DELETE . ORDERED-ASSOC-LIST-DELETE))
  :Input-Embedding
  (((ASSOCIATIVE-LIST-DELETE 1) (THE-OAL-DELETE 1))
   ((ASSOCIATIVE-LIST-DELETE 2) (THE-OAL-DELETE 2)))
  :Output-Embedding
  (((ASSOCIATIVE-LIST-DELETE 3) (THE-OAL-DELETE 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("deletes the element associated w/ key ~A ~A in the associative ~
    list ~A."
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> ASSOCIATIVE-LIST-DELETE))))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-DELETE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-DELETE 2)))))

(Defrule ASSOCIATIVE-LIST-INSERT
  "Associative Linked List Insert"
  :RHS-Node-Types
  ((THE-UNORDERED-AL-INSERT . UNORDERED-ASSOC-LIST-INSERT))
  :Input-Embedding
  (((ASSOCIATIVE-LIST-INSERT 1) (THE-UNORDERED-AL-INSERT 1))
   ((ASSOCIATIVE-LIST-INSERT 2) (THE-UNORDERED-AL-INSERT 2))
   ((ASSOCIATIVE-LIST-INSERT 3) (THE-UNORDERED-AL-INSERT 3)))
  :Output-Embedding
  (((ASSOCIATIVE-LIST-INSERT 4) (THE-UNORDERED-AL-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A (associated w/ key ~A) in the associative list ~A.~%~
    An element X replaces another Y if X's key ~A Y's key."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-INSERT 2)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-INSERT 3)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> THE-UNORDERED-AL-INSERT))))))

(Defrule ASSOCIATIVE-LIST-INSERT
  "Associative Linked List Insert"
  :RHS-Node-Types
  ((THE-OAL-INSERT . ORDERED-ASSOC-LIST-INSERT))
  :Input-Embedding
  (((ASSOCIATIVE-LIST-INSERT 1) (THE-OAL-INSERT 1))
   ((ASSOCIATIVE-LIST-INSERT 2) (THE-OAL-INSERT 2))
   ((ASSOCIATIVE-LIST-INSERT 3) (THE-OAL-INSERT 3)))
  :Output-Embedding
  (((ASSOCIATIVE-LIST-INSERT 4) (THE-OAL-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A (associated w/ key ~A) in the associative
    list ~A.~%~
    An element X replaces another Y if X's key ~A Y's key."
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-INSERT 2)))
   (INPUT-PORT-NAME> (DOC-BP> (ASSOCIATIVE-LIST-INSERT 3)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> ASSOCIATIVE-LIST-INSERT))))))

(Defrule UNORDERED-ASSOC-LIST-LOOKUP
  "Unordered Associative Linked List Lookup"
  :RHS-Node-Types
  ((UOAL-ENUM . LE)
   (FIND-ELT . EARLIEST-EQUAL-PRIORITY))
  :Edge-List
  (((UOAL-ENUM 2) . (FIND-ELT 1)))
  :Input-Embedding
  (((UNORDERED-ASSOC-LIST-LOOKUP 1) (FIND-ELT 2))
   ((UNORDERED-ASSOC-LIST-LOOKUP 2) (UOAL-ENUM 1)))
  :Output-Embedding
  (((UNORDERED-ASSOC-LIST-LOOKUP 3) (FIND-ELT 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("searches the elements of the unordered associative list ~A ~
    for an element with key ~A ~A. If no such element is ~
    found, NIL is returned."
   (INPUT-PORT-NAME> (DOC-BP> (UNORDERED-ASSOC-LIST-LOOKUP 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> UNORDERED-ASSOC-LIST-LOOKUP))))
   (INPUT-PORT-NAME> (DOC-BP> (UNORDERED-ASSOC-LIST-LOOKUP 1)))))

(Defrule UNORDERED-ASSOC-LIST-INSERT
  "Unordered Associative Linked List Insert"
  :RHS-Node-Types
  ((UOAL-PUSH . LIST-PUSH))
  :Input-Embedding
  (((UNORDERED-ASSOC-LIST-INSERT 1) (UOAL-PUSH 1))
   ((UNORDERED-ASSOC-LIST-INSERT 2) (UOAL-PUSH 2)))
  :Output-Embedding
  (((UNORDERED-ASSOC-LIST-INSERT 3) (UOAL-PUSH 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A into the unordered associative list ~A."
   (INPUT-PORT-NAME> (DOC-BP> (UNORDERED-ASSOC-LIST-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (UNORDERED-ASSOC-LIST-INSERT 2)))))

(Defrule UNORDERED-ASSOC-LIST-EMPTY?
  "Unordered Associative List Empty"
  :RHS-Node-Types
  ((UOAL-EMPTY? . LIST-EMPTY))
  :Input-Embedding
  (((UNORDERED-ASSOC-LIST-EMPTY? 1) (UOAL-EMPTY? 1)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the unordered associative list ~A is empty."
   (INPUT-PORT-NAME> (DOC-BP> (UNORDERED-ASSOC-LIST-EMPTY? 1)))))

(Defrule INTERMEDIATE-UOAL-DELETE
  "Unordered Associative Linked List Delete (intermediate)"
  :RHS-Node-Types
  ((GENERATE-CURRENT+NEXT-SUBLIST . TRAILING-GENERATE)
   (LIST-EXHAUSTED . TRUNCATE)
   (ELTS-BEFORE-P . TRUNCATE-EQUAL-PRIORITY-HEAD)
   (COLLECT-REMAINING . CONS-ACCUMULATE-UP-FROM-SUBLIST))
  :Edge-List
  (((GENERATE-CURRENT+NEXT-SUBLIST 3) . (COLLECT-REMAINING 2))
   ((GENERATE-CURRENT+NEXT-SUBLIST 2) . (LIST-EXHAUSTED 1))
   ((LIST-EXHAUSTED 2) . (ELTS-BEFORE-P 1))
   ((ELTS-BEFORE-P 3) . (COLLECT-REMAINING 1)))
  :Input-Embedding
  (((INTERMEDIATE-UOAL-DELETE 1) (ELTS-BEFORE-P 2))
   ((INTERMEDIATE-UOAL-DELETE 2)
    (GENERATE-CURRENT+NEXT-SUBLIST 1))
   ((INTERMEDIATE-UOAL-DELETE 3) (COLLECT-REMAINING 2)))
  :Output-Embedding
  (((INTERMEDIATE-UOAL-DELETE 4) (COLLECT-REMAINING 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("intermediate nonterminal: Unordered-Assoc-List-Delete."))

(Defrule UNORDERED-ASSOC-LIST-DELETE
  "Unordered Associative Linked List Delete"
  :RHS-Node-Types
  ((SPLICE-OUT-ELT . INTERMEDIATE-UOAL-DELETE))
  :Input-Embedding
  (((UNORDERED-ASSOC-LIST-DELETE 1) (SPLICE-OUT-ELT 1))
   ((UNORDERED-ASSOC-LIST-DELETE 2) (SPLICE-OUT-ELT 2)))
  :Output-Embedding
  (((UNORDERED-ASSOC-LIST-DELETE 3) (SPLICE-OUT-ELT 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("splices out the element of the unordered associative list ~
    ~A whose key is ~A ~A."
   (INPUT-PORT-NAME> (DOC-BP> (UNORDERED-ASSOC-LIST-DELETE 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (KEY-EQUALITY-INFO (N> UNORDERED-ASSOC-LIST-DELETE))))
   (INPUT-PORT-NAME> (DOC-BP> (UNORDERED-ASSOC-LIST-DELETE

(Defrule PQ-ENUMERATION
  "Priority Queue Enumeration"
  :RHS-Node-Types
  ((PQ-ENUM-FINISHED? . PQ-EMPTY)
   (PQ-EXTRACT-NEXT . PQ-EXTRACT))
  :Input-Embedding
  (((PQ-ENUMERATION 1) (PQ-EXTRACT-NEXT 1))
   ((PQ-ENUMERATION 1) (PQ-ENUM-FINISHED? 1)))
  :Output-Embedding
  (((PQ-ENUMERATION 2) (PQ-EXTRACT-NEXT 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates all of the elements in the Priority-Queue ~A,~%~
    by destructively extracting them from the queue."
   (INPUT-PORT-NAME> (DOC-BP> (PQ-ENUMERATION 1)))))
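
The following Common Lisp sketch shows one plausible shape for the code that PQ-ENUMERATION describes: an emptiness test guarding repeated destructive extraction. The structure `pq` and the functions `pq-empty-p`, `pq-extract`, and `pq-enumerate` are hypothetical names chosen for this example, and the queue is assumed to be an ordered association list of `(priority . element)` pairs with the highest priority first.

```lisp
;; Hypothetical sketch; the representation is an assumption.
(defstruct pq
  (items '()))   ; ordered alist of (priority . element), highest first

(defun pq-empty-p (queue)
  "True when QUEUE holds no elements (the PQ-EMPTY test)."
  (null (pq-items queue)))

(defun pq-extract (queue)
  "Remove and return the highest-priority element (PQ-EXTRACT)."
  (cdr (pop (pq-items queue))))

(defun pq-enumerate (queue)
  "Collect every element of QUEUE in priority order, emptying it."
  (loop until (pq-empty-p queue)
        collect (pq-extract queue)))

;; Usage: enumeration drains the queue.
(defparameter *queue*
  (make-pq :items (list (cons 3 'urgent) (cons 2 'normal) (cons 1 'idle))))
(defparameter *drained* (pq-enumerate *queue*))   ; => (URGENT NORMAL IDLE)
```

After the call, `(pq-empty-p *queue*)` is true, which is the destructive behaviour the rule's documentation string mentions.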

(Defrule PQ-EMPTY
  "Priority Queue Empty"
  :RHS-Node-Types
  ((EMPTY-LIST? . TEST-PREDICATE))
  :Input-Embedding
  (((PQ-EMPTY 1) (EMPTY-LIST? 1)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("tests whether the Priority Queue ~A is empty."
   (INPUT-PORT-NAME> (DOC-BP> (PQ-EMPTY 1)))))

(Defrule PQ-EXTRACT
  "Priority Queue Extract"
  :RHS-Node-Types
  ((EXTRACT-FROM-OAL . ORDERED-ASSOC-LIST-EXTRACT))
  :Input-Embedding
  (((PQ-EXTRACT 1) (EXTRACT-FROM-OAL 1)))
  :Output-Embedding
  (((PQ-EXTRACT 2) (EXTRACT-FROM-OAL 2))
   ((PQ-EXTRACT 3) (EXTRACT-FROM-OAL 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("extracts the highest priority element in the Priority Queue ~A.~%~
    The priority queue is implemented as an ordered associative list."
   (INPUT-PORT-NAME> (DOC-BP> (EXTRACT-FROM-OAL 1)))))

(Defrule PQ-INSERT
  "Priority Queue Insert"
  :RHS-Node-Types
  ((ORDERED-SPLICE-IN . ORDERED-ASSOC-LIST-INSERT))
  :Input-Embedding
  (((PQ-INSERT 1) (ORDERED-SPLICE-IN 1))
   ((PQ-INSERT 2) (ORDERED-SPLICE-IN 2))
   ((PQ-INSERT 3) (ORDERED-SPLICE-IN 3)))
  :Output-Embedding
  (((PQ-INSERT 4) (ORDERED-SPLICE-IN 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A in the priority queue ~A.~%~
    An element's priority P is higher than another's Q, if P ~A Q.~%~
    If an element already exists in the priority queue with the same ~
    priority, then the new element is inserted into the queue after ~
    the existing element."
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-SPLICE-IN 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-SPLICE-IN 3)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (PRIORITY-COMPARATOR-INFO (N> ORDERED-SPLICE-IN))))))


(Defrule ORDERED-ASSOC-LIST-INSERT
  "Ordered Associative List Insert"
  :RHS-Node-Types
  ((THE-UNSAFE-INSERT . ORDERED-ASSOC-LIST-INSERT-UNSAFE))
  :Input-Embedding
  (((ORDERED-ASSOC-LIST-INSERT 1) (THE-UNSAFE-INSERT 1))
   ((ORDERED-ASSOC-LIST-INSERT 2) (THE-UNSAFE-INSERT 2))
   ((ORDERED-ASSOC-LIST-INSERT 3) (THE-UNSAFE-INSERT 3)))
  :Output-Embedding
  (((ORDERED-ASSOC-LIST-INSERT 4) (THE-UNSAFE-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A in the ordered associative list ~A, associated with ~
    priority ~A. An element X occurs before another Y if X's priority
    ~A Y's priority."
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-INSERT 3)))
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-INSERT 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (PRIORITY-COMPARATOR-INFO (N> THE-UNSAFE-INSERT))))))


(Defrule ORDERED-ASSOC-LIST-INSERT
  "Ordered Associative List Insert"
  :RHS-Node-Types
  ((THE-SAFE-INSERT . ORDERED-ASSOC-LIST-INSERT-SAFE))
  :Input-Embedding
  (((ORDERED-ASSOC-LIST-INSERT 1) (THE-SAFE-INSERT 1))
   ((ORDERED-ASSOC-LIST-INSERT 2) (THE-SAFE-INSERT 2))
   ((ORDERED-ASSOC-LIST-INSERT 3) (THE-SAFE-INSERT 3)))
  :Output-Embedding
  (((ORDERED-ASSOC-LIST-INSERT 4) (THE-SAFE-INSERT 4)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("inserts ~A in the ordered associative list ~A, associated ~
    with priority ~A. An element X occurs before another Y if ~
    X's priority ~A Y's priority."
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-INSERT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-INSERT 3)))
   (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-INSERT 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (PRIORITY-COMPARATOR-INFO (N> THE-SAFE-INSERT))))))

(Defrule ORDERED-ASSOC-LIST-INSERT-SAFE
  "Ordered Associative List Insert Safe"
  :RHS-Node-Types
  ((ENUMERATE-FRONT . ENUM-OAL-FRONT)
   (FIND-TAIL . FIND-OAL-TAIL)
   (DO-INSERT . OAL-SPLICE-IN))
  :Edge-List
  (((ENUMERATE-FRONT 3) . (DO-INSERT 1))
   ((FIND-TAIL 3) . (DO-INSERT 3)))
  :Input-Embedding
  (((ORDERED-ASSOC-LIST-INSERT-SAFE 1) (DO-INSERT 2))
   ((ORDERED-ASSOC-LIST-INSERT-SAFE 2) (FIND-TAIL 2))
   ((ORDERED-ASSOC-LIST-INSERT-SAFE 2) (ENUMERATE-FRONT 2))
   ((ORDERED-ASSOC-LIST-INSERT-SAFE 3) (FIND-TAIL 1))
   ((ORDERED-ASSOC-LIST-INSERT-SAFE 3) (ENUMERATE-FRONT 1)))
  :Output-Embedding
  (((ORDERED-ASSOC-LIST-INSERT-SAFE 4) (DO-INSERT 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("inserts ~A (associated w/ priority ~A) in the ordered
    associative list ~A. An element X occurs before another Y
    if X's priority ~A Y's priority.~%~
    If an element already exists in the list with priority ~A, ~
    then the new element is inserted into the list after the
    existing element."
   (INPUT-PORT-NAME> (DOC-BP> (DO-INSERT 2)))
   (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-FRONT 2)))
   (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-FRONT 1)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (PRIORITY-COMPARATOR-INFO
                    (N> ORDERED-ASSOC-LIST-INSERT-SAFE))))
   (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-FRONT 2)))))
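
To make the "safe" insertion behaviour concrete, here is a minimal Common Lisp sketch of an insert that places a new entry after any existing entries of equal priority. The function name `ordered-insert-safe`, the `(priority . element)` entry layout, and the default comparator `>` are assumptions for illustration; the cliche itself leaves the comparator as a parameter.

```lisp
;; Hypothetical sketch; entry layout and comparator are assumptions.
(defun ordered-insert-safe (element priority alist &key (higher-p #'>))
  "Return a list with (PRIORITY . ELEMENT) inserted into ALIST, which
is ordered from highest to lowest priority. The new entry is placed
after any existing entries whose priority equals PRIORITY."
  (let ((front '())
        (tail alist))
    ;; ENUM-OAL-FRONT: gather entries up to, but not including, the
    ;; first entry whose priority is lower than PRIORITY.
    ;; FIND-OAL-TAIL: TAIL is left pointing at that entry, if any.
    (loop while (and tail
                     (not (funcall higher-p priority (car (first tail)))))
          do (push (pop tail) front))
    ;; OAL-SPLICE-IN: front, then the new entry, then the tail.
    (append (nreverse front) (list (cons priority element)) tail)))

;; Usage: X joins the run of priority-3 entries at its end.
(defparameter *oal*
  (ordered-insert-safe 'x 3 '((5 . a) (3 . b) (3 . c) (1 . d))))
;; *oal* => ((5 . A) (3 . B) (3 . C) (3 . X) (1 . D))
```

The input list is not modified; the result shares its tail with the original, which is one common way to realize the splice.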

(Defrule ENUM-OAL-FRONT
  "Enumerate Ordered Associative List Front"
  :RHS-Node-Types
  ((CDR-DOWN . GENERATE)
   (HEAD-IN-FRONT? . TRUNCATE-OAL-POSITION)
   (THE-HEAD-MAP . CAR-MAP))
  :Edge-List
  (((CDR-DOWN 2) . (HEAD-IN-FRONT? 1))
   ((HEAD-IN-FRONT? 3) . (THE-HEAD-MAP 1)))
  :Input-Embedding
  (((ENUM-OAL-FRONT 1) (CDR-DOWN 1))
   ((ENUM-OAL-FRONT 2) (HEAD-IN-FRONT? 2)))
  :Output-Embedding
  (((ENUM-OAL-FRONT 3) (THE-HEAD-MAP 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates the elements of the Ordered Associative list ~A ~
    up to, but not including, the element (if any) that has
    lower priority than ~A. If there is no such element, all ~
    elements of the list are enumerated."
   (INPUT-PORT-NAME> (DOC-BP> (CDR-DOWN 1)))
   (INPUT-PORT-NAME> (DOC-BP> (HEAD-IN-FRONT? 2)))))

(Defrule FIND-OAL-TAIL
  "Find Ordered Associative List Tail"
  :RHS-Node-Types
  ((CDR-DOWN2 . GENERATE)
   (HEAD-OF-TAIL? . EARLIEST-OAL-POSITION))
  :Edge-List
  (((CDR-DOWN2 2) . (HEAD-OF-TAIL? 1)))
  :Input-Embedding
  (((FIND-OAL-TAIL 1) (CDR-DOWN2 1))
   ((FIND-OAL-TAIL 2) (HEAD-OF-TAIL? 2)))
  :Output-Embedding
  (((FIND-OAL-TAIL 3) (HEAD-OF-TAIL? 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("finds the tail of ~A (if any) whose head has lower priority
    than ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CDR-DOWN2 1)))
   (INPUT-PORT-NAME> (DOC-BP> (HEAD-OF-TAIL? 2)))))


(Defrule ENUM-OAL-FRONT-UNSAFE
  "Unsafe Enumerate Ordered Associative List Front"
  :RHS-Node-Types
  ((CDR-DOWN-FRONT . GENERATE)
   (HEAD-BELONG-IN-FRONT? . TRUNCATE-OAL-POSITION-UNSAFE)
   (EXTRACT-HEAD . CAR-MAP))
  :Edge-List
  (((CDR-DOWN-FRONT 2) . (EXTRACT-HEAD 1))
   ((CDR-DOWN-FRONT 2) . (HEAD-BELONG-IN-FRONT? 1)))
  :Input-Embedding
  (((ENUM-OAL-FRONT-UNSAFE 1) (CDR-DOWN-FRONT 1))
   ((ENUM-OAL-FRONT-UNSAFE 2) (HEAD-BELONG-IN-FRONT? 2)))
  :Output-Embedding
  (((ENUM-OAL-FRONT-UNSAFE 3) (EXTRACT-HEAD 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates the elements of the Ordered Associative list ~A up to, ~
    but not including, the element (if any) that has equal or lower ~
    priority than ~A. If there is no such element, all elements of the ~
    list are enumerated. Priority equality is tested using ~A and the ~
    priorities are ordered by ~A."
   (INPUT-PORT-NAME> (DOC-BP> (CDR-DOWN-FRONT 1)))
   (INPUT-PORT-NAME> (DOC-BP> (HEAD-BELONG-IN-FRONT? 2)))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (PRIORITY-EQUALITY-INFO (N> ENUM-OAL-FRONT-UNSAFE))))
   (FUNCTION-NAME (FUNCTION-TYPE
                   (PRIORITY-COMPARATOR-INFO (N> ENUM-OAL-FRONT-UNSAFE))))))

(Defrule FIND-OAL-TAIL-UNSAFE
"Unsafe Find Ordered Associative List Tail"
:RHS-Node-Types
((PREV-CURRENT-SUBLISTS . TRAILING-GENERATE)
 (THE-SAFE-EARLIEST . EARLIEST-OAL-POSITION)
 (THE-UNSAFE-EARLIEST . EARLIEST-EQUAL-PRIORITY-HEAD))
:Edge-List
(((PREV-CURRENT-SUBLISTS 2) . (THE-UNSAFE-EARLIEST 1))
 ((PREV-CURRENT-SUBLISTS 2) . (THE-SAFE-EARLIEST 1)))
:Input-Embedding
(((FIND-OAL-TAIL-UNSAFE 1) (PREV-CURRENT-SUBLISTS 1))
 ((FIND-OAL-TAIL-UNSAFE 2) (THE-UNSAFE-EARLIEST 2))
 ((FIND-OAL-TAIL-UNSAFE 2) (THE-SAFE-EARLIEST 2)))
:Output-Embedding
(((FIND-OAL-TAIL-UNSAFE 3) (PREV-CURRENT-SUBLISTS 3))
 ((FIND-OAL-TAIL-UNSAFE 3) (THE-SAFE-EARLIEST 3)))
:L-R-Link COMPOSITION
:Doc
("finds the tail of ~A (if any) whose head has equal or lower priority
than ~A. Priority equality is tested using ~A and the priorities ~
are ordered by ~A."
 (INPUT-PORT-NAME> (DOC-BP> (PREV-CURRENT-SUBLISTS 1)))
 (INPUT-PORT-NAME> (DOC-BP> (THE-SAFE-EARLIEST 2)))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-EQUALITY-INFO (N> FIND-OAL-TAIL-UNSAFE))))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-COMPARATOR-INFO (N> FIND-OAL-TAIL-UNSAFE))))))

(Defrule ORDERED-ASSOC-LIST-DELETE
"Ordered Associative List Delete"
:RHS-Node-Types
((UNSAFE-FRONT-ENUMERATION . ENUM-OAL-FRONT-UNSAFE)
 (UNSAFE-TAIL-SEARCH . FIND-OAL-TAIL-UNSAFE)
 (CONS-UP-REMAINING . CONS-ACCUMULATE-UP-FROM-SUBLIST))
:Edge-List
(((UNSAFE-FRONT-ENUMERATION 3) . (CONS-UP-REMAINING 1))
 ((UNSAFE-TAIL-SEARCH 3) . (CONS-UP-REMAINING 2)))
:Input-Embedding
(((ORDERED-ASSOC-LIST-DELETE 2) (UNSAFE-TAIL-SEARCH 1))
 ((ORDERED-ASSOC-LIST-DELETE 2) (UNSAFE-FRONT-ENUMERATION 1))
 ((ORDERED-ASSOC-LIST-DELETE 1) (UNSAFE-TAIL-SEARCH 2))
 ((ORDERED-ASSOC-LIST-DELETE 1) (UNSAFE-FRONT-ENUMERATION 2)))
:Output-Embedding
(((ORDERED-ASSOC-LIST-DELETE 3) (CONS-UP-REMAINING 3)))
:L-R-Link COMPOSITION
:Doc
("deletes the element associated w/ priority ~A from the ordered ~
associative list ~A.~%~
The predicate used to test for priority equality is ~A.~%~
If there is more than 1 element with this priority, only the first ~
is removed. An element X occurs before another Y if X's priority ~
~A Y's priority."
 (INPUT-PORT-NAME> (DOC-BP> (UNSAFE-FRONT-ENUMERATION 2)))
 (INPUT-PORT-NAME> (DOC-BP> (UNSAFE-FRONT-ENUMERATION 1)))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-EQUALITY-INFO (N> ORDERED-ASSOC-LIST-DELETE))))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-COMPARATOR-INFO (N> ORDERED-ASSOC-LIST-DELETE))))))


(Defrule ORDERED-ASSOC-LIST-INSERT-UNSAFE
"Unsafe Ordered Associative List Insert"
:RHS-Node-Types
((ENUMERATE-FRONT-UNSAFELY . ENUM-OAL-FRONT-UNSAFE)
 (FIND-TAIL-UNSAFELY . FIND-OAL-TAIL-UNSAFE)
 (THE-INSERTION . OAL-SPLICE-IN))
:Edge-List
(((ENUMERATE-FRONT-UNSAFELY 3) . (THE-INSERTION 1))
 ((FIND-TAIL-UNSAFELY 3) . (THE-INSERTION 3)))
:Input-Embedding
(((ORDERED-ASSOC-LIST-INSERT-UNSAFE 1) (THE-INSERTION 2))
 ((ORDERED-ASSOC-LIST-INSERT-UNSAFE 2) (FIND-TAIL-UNSAFELY 2))
 ((ORDERED-ASSOC-LIST-INSERT-UNSAFE 2) (ENUMERATE-FRONT-UNSAFELY 2))
 ((ORDERED-ASSOC-LIST-INSERT-UNSAFE 3) (FIND-TAIL-UNSAFELY 1))
 ((ORDERED-ASSOC-LIST-INSERT-UNSAFE 3) (ENUMERATE-FRONT-UNSAFELY 1)))
:Output-Embedding
(((ORDERED-ASSOC-LIST-INSERT-UNSAFE 4) (THE-INSERTION 4)))
:L-R-Link COMPOSITION
:Doc
("inserts ~A (associated w/ priority ~A) in the ordered ~
associative list ~A. The insertion is unsafe in that if ~
there is an existing element in the list that has priority ~
~A ~A, then that existing element is replaced by ~A.~%~
An element X occurs before another Y if X's priority ~A Y's ~
priority."
 (INPUT-PORT-NAME> (DOC-BP> (THE-INSERTION 2)))
 (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-FRONT-UNSAFELY 2)))
 (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-FRONT-UNSAFELY 1)))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-EQUALITY-INFO
   (N> ORDERED-ASSOC-LIST-INSERT-UNSAFE))))
 (INPUT-PORT-NAME> (DOC-BP> (ENUMERATE-FRONT-UNSAFELY 2)))
 (INPUT-PORT-NAME> (DOC-BP> (THE-INSERTION 2)))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-COMPARATOR-INFO
   (N> ORDERED-ASSOC-LIST-INSERT-UNSAFE))))))


(Defrule OAL-RETRIEVE-IF-EXISTS
"Ordered-Associative List Retrieve (If Exists)"
:RHS-Node-Types
((ENUM-OAL . ORDERED-ASSOC-LE)
 (EARLIEST-ELEMENT . EARLIEST-EQUAL-PRIORITY))
:Edge-List
(((ENUM-OAL 3) . (EARLIEST-ELEMENT 1)))
:Input-Embedding
(((OAL-RETRIEVE-IF-EXISTS 1) (EARLIEST-ELEMENT 2))
 ((OAL-RETRIEVE-IF-EXISTS 1) (ENUM-OAL 2))
 ((OAL-RETRIEVE-IF-EXISTS 2) (ENUM-OAL 1)))
:Output-Embedding
(((OAL-RETRIEVE-IF-EXISTS 4) (EARLIEST-ELEMENT 3)))
:St-Thrus
(((OAL-RETRIEVE-IF-EXISTS 3) (OAL-RETRIEVE-IF-EXISTS 4)))
:L-R-Link COMPOSITION
:Doc
("intermediate non-terminal: Ordered-Assoc-List-Lookup."))


(Defrule ORDERED-ASSOC-LIST-LOOKUP
"Ordered Associative List Lookup"
:RHS-Node-Types
((THE-RETRIEVAL . OAL-RETRIEVE-IF-EXISTS))
:Input-Embedding
(((ORDERED-ASSOC-LIST-LOOKUP 1) (THE-RETRIEVAL 1))
 ((ORDERED-ASSOC-LIST-LOOKUP 2) (THE-RETRIEVAL 2)))
:Output-Embedding
(((ORDERED-ASSOC-LIST-LOOKUP 3) (THE-RETRIEVAL 4)))
:L-R-Link IMPLEMENTATION
:Doc
("finds and returns the element associated w/ priority ~A in
the ordered associative list ~A.~%~
If no element with priority ~A is found, NIL is returned.
The predicate used to test for priority equality is ~A.~%~
If there is more than 1 element with this priority, only ~
the first is retrieved. An element X occurs before another
Y if X's priority ~A Y's priority."
 (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-LOOKUP 1)))
 (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-LOOKUP 2)))
 (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LIST-LOOKUP 1)))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-EQUALITY-INFO (N> ORDERED-ASSOC-LIST-LOOKUP))))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-COMPARATOR-INFO
   (N> ORDERED-ASSOC-LIST-LOOKUP))))))


(Defrule ORDERED-ASSOC-LE
"Ordered Associative List Enumeration"
:RHS-Node-Types
((THE-ORDERED-ASSOC-SLE . ORDERED-ASSOC-SLE)
 (EACH-ELEMENT . CAR-MAP))
:Edge-List
(((THE-ORDERED-ASSOC-SLE 3) . (EACH-ELEMENT 1)))
:Input-Embedding
(((ORDERED-ASSOC-LE 1) (THE-ORDERED-ASSOC-SLE 1))
 ((ORDERED-ASSOC-LE 2) (THE-ORDERED-ASSOC-SLE 2)))
:Output-Embedding
(((ORDERED-ASSOC-LE 3) (EACH-ELEMENT 2)))
:L-R-Link COMPOSITION
:Doc
("enumerates the elements of ~A, up to, but not including,
the element that has lower priority than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LE 1)))
 (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-LE 2)))))

(Defrule ORDERED-ASSOC-SLE
"Ordered Associative Sublist Enumeration"
:RHS-Node-Types
((OAL-GENERATE . GENERATE)
 (OAL-TRUNCATE . TRUNCATE-OAL-POSITION))
:Edge-List
(((OAL-GENERATE 2) . (OAL-TRUNCATE 1)))
:Input-Embedding
(((ORDERED-ASSOC-SLE 1) (OAL-GENERATE 1))
 ((ORDERED-ASSOC-SLE 2) (OAL-TRUNCATE 2)))
:Output-Embedding
(((ORDERED-ASSOC-SLE 3) (OAL-TRUNCATE 3)))
:L-R-Link COMPOSITION
:Doc
("enumerates the successive sublists of ~A, up to, but not including,
the sublist with a head that has lower priority than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-SLE 1)))
 (INPUT-PORT-NAME> (DOC-BP> (ORDERED-ASSOC-SLE 2)))))


(Defrule LIST-PUSH
"List Push"
:RHS-Node-Types
((THE-CONS . CONS))
:Input-Embedding
(((LIST-PUSH 1) (THE-CONS 1))
 ((LIST-PUSH 2) (THE-CONS 2)))
:Output-Embedding
(((LIST-PUSH 3) (THE-CONS 3)))
:L-R-Link IMPLEMENTATION
:Doc
("pushes ~A onto the list ~A."
 (INPUT-PORT-NAME> (DOC-BP> (LIST-PUSH 1)))
 (INPUT-PORT-NAME> (DOC-BP> (LIST-PUSH 2)))))

(Defrule oal-splice-out
"Splice out of Ordered Associative List"
:RHS-Node-Types
((POP-TAIL . CDR)
 (ADD-FRONT . CONS-ACCUMULATE-UP-FROM-SUBLIST))
:Edge-List
(((POP-TAIL 2) . (ADD-FRONT 2)))
:Input-Embedding
(((OAL-SPLICE-OUT 1) (ADD-FRONT 1))
 ((OAL-SPLICE-OUT 2) (POP-TAIL 1)))
:Output-Embedding
(((OAL-SPLICE-OUT 3) (ADD-FRONT 3)))
:L-R-Link COMPOSITION
:Doc
("splices the head of the ~A out of the ordered associative list~%~
that contains it as a tail."
 (INPUT-PORT-NAME> (DOC-BP> (POP-TAIL 1)))))

(Defrule OAL-SPLICE-IN
"Ordered Associative List Splice In"
:RHS-Node-Types
((PUSH-ONTO-TAIL . LIST-PUSH)
 (CONS-UP-FRONT . CONS-ACCUMULATE-UP-FROM-SUBLIST))
:Edge-List
(((PUSH-ONTO-TAIL 3) . (CONS-UP-FRONT 2)))
:Input-Embedding
(((OAL-SPLICE-IN 1) (CONS-UP-FRONT 1))
 ((OAL-SPLICE-IN 2) (PUSH-ONTO-TAIL 1))
 ((OAL-SPLICE-IN 3) (PUSH-ONTO-TAIL 2)))
:Output-Embedding
(((OAL-SPLICE-IN 4) (CONS-UP-FRONT 3)))
:L-R-Link COMPOSITION
:Doc
("splices ~A in between the front of the list ~A and the tail ~A."
 (INPUT-PORT-NAME> (DOC-BP> (PUSH-ONTO-TAIL 1)))
 (INPUT-PORT-NAME> (DOC-BP> (CONS-UP-FRONT 1)))
 (INPUT-PORT-NAME> (DOC-BP> (PUSH-ONTO-TAIL 2)))))


(Defrule TRUNCATE-OAL-POSITION-UNSAFE
"Unsafe Truncate at Priority Position"
:RHS-Node-Types
((THE-SAFE-TRUNCATE . TRUNCATE-OAL-POSITION)
 (THE-UNSAFE-TRUNCATE . TRUNCATE-EQUAL-PRIORITY-HEAD))
:Edge-List
(((THE-SAFE-TRUNCATE 3) . (THE-UNSAFE-TRUNCATE 1)))
:Input-Embedding
(((TRUNCATE-OAL-POSITION-UNSAFE 1) (THE-SAFE-TRUNCATE 1))
 ((TRUNCATE-OAL-POSITION-UNSAFE 2) (THE-UNSAFE-TRUNCATE 2))
 ((TRUNCATE-OAL-POSITION-UNSAFE 2) (THE-SAFE-TRUNCATE 2)))
:Output-Embedding
(((TRUNCATE-OAL-POSITION-UNSAFE 3) (THE-UNSAFE-TRUNCATE 3)))
:L-R-Link COMPOSITION
:Doc
("outputs the elements of the input series (each elt. is an ~
ordered associative list), ~
up to but not including the one that is empty or has a head ~
with priority less than or equal to ~A.~%~
A priority P is less than another Q if P ~A Q.~
A priority P is equal to another Q if P ~A Q."
 (INPUT-PORT-NAME> (DOC-BP> (THE-SAFE-TRUNCATE 2)))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-COMPARATOR-INFO (N> THE-SAFE-TRUNCATE))))
 (FUNCTION-NAME (FUNCTION-TYPE
  (PRIORITY-EQUALITY-INFO (N> THE-UNSAFE-TRUNCATE))))))

(Defrule TRUNCATE-EQUAL-PRIORITY-HEAD
"Truncate Equal Priority Head"
:RHS-Node-Types
((PH-EQUALITY-TEST . EQUAL-PRIORITY-HEAD))
:Input-Embedding
(((TRUNCATE-EQUAL-PRIORITY-HEAD 1) (PH-EQUALITY-TEST 1))
 ((TRUNCATE-EQUAL-PRIORITY-HEAD 2) (PH-EQUALITY-TEST 2)))
:St-Thrus
(((TRUNCATE-EQUAL-PRIORITY-HEAD 1)
  (TRUNCATE-EQUAL-PRIORITY-HEAD 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the elements of the input series (each elt. is an ~
associative list), up to but not including the one that is ~
empty or has a head with lower priority than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (PH-EQUALITY-TEST 2)))))

(Defrule EARLIEST-EQUAL-PRIORITY-HEAD
"Earliest Equal Priority Head"
:RHS-Node-Types
((EQUAL-PH-SEARCH . EQUAL-PRIORITY-HEAD))
:Input-Embedding
(((EARLIEST-EQUAL-PRIORITY-HEAD 1) (EQUAL-PH-SEARCH 1))
 ((EARLIEST-EQUAL-PRIORITY-HEAD 2) (EQUAL-PH-SEARCH 2)))
:St-Thrus
(((EARLIEST-EQUAL-PRIORITY-HEAD 1)
  (EARLIEST-EQUAL-PRIORITY-HEAD 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the first element of the input series (each elt. is ~
an ordered associative list), that has a head with ~
priority ~A."
 (INPUT-PORT-NAME> (DOC-BP> (EARLIEST-EQUAL-PRIORITY-HEAD 2)))))

(Defrule EQUAL-PRIORITY-HEAD
"Equal Priority Head"
:RHS-Node-Types
((ACCESS-HEAD . CAR)
 (CHECK-PRIORITIES . EQUAL-PRIORITY-TEST))
:Edge-List
(((ACCESS-HEAD 2) . (CHECK-PRIORITIES 2)))
:Input-Embedding
(((EQUAL-PRIORITY-HEAD 1) (ACCESS-HEAD 1))
 ((EQUAL-PRIORITY-HEAD 2) (CHECK-PRIORITIES 1)))
:L-R-Link COMPOSITION
:Doc
("tests whether the head of the input associative list ~A has ~
priority ~A."
 (INPUT-PORT-NAME> (DOC-BP> (ACCESS-HEAD 1)))
 (INPUT-PORT-NAME> (DOC-BP> (CHECK-PRIORITIES 1)))))

(Defrule TRUNCATE-EQUAL-PRIORITY
"Truncate Equal Priority"
:RHS-Node-Types
((PRIORITY-EQUALITY-TEST . EQUAL-PRIORITY-TEST))
:Input-Embedding
(((TRUNCATE-EQUAL-PRIORITY 1) (PRIORITY-EQUALITY-TEST 2))
 ((TRUNCATE-EQUAL-PRIORITY 2) (PRIORITY-EQUALITY-TEST 1)))
:St-Thrus
(((TRUNCATE-EQUAL-PRIORITY 1) (TRUNCATE-EQUAL-PRIORITY 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the elements of the input series,~
up to but not including the one that has lower priority
than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (PRIORITY-EQUALITY-TEST 1)))))

(Defrule TRUNCATE-EQUAL-PRIORITY
"Truncate Equal Priority"
:RHS-Node-Types
((PRIORITY-EQUALITY-TEST . EQUAL-PRIORITY-TEST))
:Input-Embedding
(((TRUNCATE-EQUAL-PRIORITY 1) (PRIORITY-EQUALITY-TEST 2))
 ((TRUNCATE-EQUAL-PRIORITY 2) (PRIORITY-EQUALITY-TEST 2)))
:St-Thrus
(((TRUNCATE-EQUAL-PRIORITY 1) (TRUNCATE-EQUAL-PRIORITY 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the elements of the input series, up to but not ~
including the one that has lower priority than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (PRIORITY-EQUALITY-TEST 2)))))


(Defrule EARLIEST-EQUAL-PRIORITY
"Earliest Equal Priority"
:RHS-Node-Types
((EQUAL-P-SEARCH . EQUAL-PRIORITY-TEST))
:Input-Embedding
(((EARLIEST-EQUAL-PRIORITY 1) (EQUAL-P-SEARCH 2))
 ((EARLIEST-EQUAL-PRIORITY 2) (EQUAL-P-SEARCH 1)))
:St-Thrus
(((EARLIEST-EQUAL-PRIORITY 1) (EARLIEST-EQUAL-PRIORITY 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the first element of the input series~
~&that has priority ~A."
 (INPUT-PORT-NAME> (DOC-BP> (EQUAL-P-SEARCH 1)))))

(Defrule EARLIEST-EQUAL-PRIORITY
"Earliest Equal Priority"
:RHS-Node-Types
((EQUAL-P-SEARCH . EQUAL-PRIORITY-TEST))
:Input-Embedding
(((EARLIEST-EQUAL-PRIORITY 1) (EQUAL-P-SEARCH 1))
 ((EARLIEST-EQUAL-PRIORITY 2) (EQUAL-P-SEARCH 2)))
:St-Thrus
(((EARLIEST-EQUAL-PRIORITY 1) (EARLIEST-EQUAL-PRIORITY 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the first element of the input series~
~&that has priority ~A."
 (INPUT-PORT-NAME> (DOC-BP> (EQUAL-P-SEARCH 2)))))

(Defrule EQUAL-PRIORITY-TEST
"Equal Priority Test"
:RHS-Node-Types
((EQUAL-PRIORITIES . COMMUTATIVE-BINARY-FUNCTION)
 (THE-TEST . NULL-TEST))
:Edge-List
(((EQUAL-PRIORITIES 3) . (THE-TEST 1)))
:Input-Embedding
(((EQUAL-PRIORITY-TEST 1) (EQUAL-PRIORITIES 1))
 ((EQUAL-PRIORITY-TEST 2) (EQUAL-PRIORITIES 2)))
:L-R-Link COMPOSITION
:Doc
("tests whether ~A and ~A have ~A priorities."
 (INPUT-PORT-NAME> (DOC-BP> (EQUAL-PRIORITY-TEST 1)))
 (INPUT-PORT-NAME> (DOC-BP> (EQUAL-PRIORITY-TEST 2)))
 (EQUALITY-PREDICATE? (N> EQUAL-PRIORITY-TEST))))

(Defrule TRUNCATE-OAL-POSITION
"Truncate at Priority Position"
:RHS-Node-Types
((POSITION-TEST . EMPTY-OR-LOW-PRIORITY-HEAD))
:Input-Embedding
(((TRUNCATE-OAL-POSITION 1) (POSITION-TEST 1))
 ((TRUNCATE-OAL-POSITION 2) (POSITION-TEST 2)))
:St-Thrus
(((TRUNCATE-OAL-POSITION 1) (TRUNCATE-OAL-POSITION 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the elements of the input series (each elt. is an ~
ordered associative list), ~
~&up to but not including the one that is empty or has a head ~
~&with lower priority than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (POSITION-TEST 2)))))

(Defrule EARLIEST-OAL-POSITION
"Earliest Priority Position"
:RHS-Node-Types
((OAL-POSITION-SEARCH . EMPTY-OR-LOW-PRIORITY-HEAD))
:Input-Embedding
(((EARLIEST-OAL-POSITION 1) (OAL-POSITION-SEARCH 1))
 ((EARLIEST-OAL-POSITION 2) (OAL-POSITION-SEARCH 2)))
:St-Thrus
(((EARLIEST-OAL-POSITION 1) (EARLIEST-OAL-POSITION 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("outputs the first element of the input series (each elt. is an ~
ordered associative list),~
~&that is either empty or has a head with lower priority than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (EARLIEST-OAL-POSITION 2)))))

(Defrule EMPTY-OR-LOW-PRIORITY-HEAD
"Empty or Low Priority Head"
:RHS-Node-Types
((EMPTY? . NULL)
 (CONTROL-COMPARISON . NULL-TEST)
 (GET-HEAD . CAR)
 (COMPARE-PRIORITIES . ANY-COMPARATOR)
 (OR-TEST . NULL-TEST))
:Edge-List
(((EMPTY? 2) . (OR-TEST 1))
 ((EMPTY? 2) . (CONTROL-COMPARISON 1))
 ((GET-HEAD 2) . (COMPARE-PRIORITIES 2))
 ((COMPARE-PRIORITIES 3) . (OR-TEST 1)))
:Input-Embedding
(((EMPTY-OR-LOW-PRIORITY-HEAD 1) (GET-HEAD 1))
 ((EMPTY-OR-LOW-PRIORITY-HEAD 1) (EMPTY? 1))
 ((EMPTY-OR-LOW-PRIORITY-HEAD 2) (COMPARE-PRIORITIES 1)))
:L-R-Link COMPOSITION
:Doc
("tests whether the list ~A is either empty or has a first
element that has a lower priority than ~A."
 (INPUT-PORT-NAME> (DOC-BP> (EMPTY-OR-LOW-PRIORITY-HEAD 1)))
 (INPUT-PORT-NAME> (DOC-BP> (EMPTY-OR-LOW-PRIORITY-HEAD 2)))))

(Defrule ORDERED-ASSOC-LIST-EXTRACT
"Ordered Associative List Extract"
:RHS-Node-Types
((THE-POP . LIST-POP))
:Input-Embedding
(((ORDERED-ASSOC-LIST-EXTRACT 1) (THE-POP 1)))
:Output-Embedding
(((ORDERED-ASSOC-LIST-EXTRACT 2) (THE-POP 2))
 ((ORDERED-ASSOC-LIST-EXTRACT 3) (THE-POP 3)))
:L-R-Link IMPLEMENTATION
:Doc
("extracts the highest priority element from the ordered ~
associative list ~A by popping the first element."
 (INPUT-PORT-NAME> (DOC-BP> (THE-POP 1)))))

(Defrule LIST-POP
"List Pop"
:RHS-Node-Types
((PULL-OFF-HEAD . CAR)
 (GET-TAIL . CDR))
:Input-Embedding
(((LIST-POP 1) (GET-TAIL 1))
 ((LIST-POP 1) (PULL-OFF-HEAD 1)))
:Output-Embedding
(((LIST-POP 2) (PULL-OFF-HEAD 2))
 ((LIST-POP 3) (GET-TAIL 2)))
:L-R-Link COMPOSITION
:Doc
("pops the first element off of the list ~A."
 (INPUT-PORT-NAME> (DOC-BP> (GET-TAIL 1)))))

(Defrule ACCUMULATION-UP
"Accumulation Up"
:RHS-Node-Types
((ACCUM-FUNCTION . ANY-BIN-F))
:Input-Embedding
(((ACCUMULATION-UP 2) (ACCUM-FUNCTION 1)))
:Output-Embedding
(((ACCUMULATION-UP 3) (ACCUM-FUNCTION 3)))
:St-Thrus
(((ACCUMULATION-UP 1) (ACCUMULATION-UP 3)))
:L-R-Link COMPOSITION
:Doc
("iteratively applies the function ~A to the result of the ~
recursive call and a new value. The result of the application
is returned as the result of the recursive call."
 (FUNCTION-TYPE (FUNCTION-INFO (N> ACCUM-FUNCTION)))))

(Defrule ACCUMULATE-UP
"Accumulate on the way up"
:RHS-Node-Types
((ITER-ACCUM-UP . ACCUMULATION-UP))
:Input-Embedding
(((ACCUMULATE-UP 1) (ITER-ACCUM-UP 1))
 ((ACCUMULATE-UP 2) (ITER-ACCUM-UP 2)))
:Output-Embedding
(((ACCUMULATE-UP 3) (ITER-ACCUM-UP 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("accumulates the values of the input series 'on the way up' ~
using the function ~A. The initial value of the accumulation
is ~A."
 (FUNCTION-TYPE (FUNCTION-INFO (N> ITER-ACCUM-UP)))
 (INIT-VALUE (N> ITER-ACCUM-UP))))

(Defrule CONS-ACCUMULATE-UP
"Cons Accumulate on the way up"
:RHS-Node-Types
((THE-UP-ACCUM . ACCUMULATE-UP))
:Input-Embedding
(((CONS-ACCUMULATE-UP 1) (THE-UP-ACCUM 2)))
:Output-Embedding
(((CONS-ACCUMULATE-UP 2) (THE-UP-ACCUM 3)))
:L-R-Link IMPLEMENTATION
:Doc
("accumulates the elements of ~A into a list using cons."
 (INPUT-PORT-NAME> (DOC-BP> (CONS-ACCUMULATE-UP 1)))))

(Defrule CONS-ACCUMULATE-UP-FROM-SUBLIST
"Cons Accumulate on the way up from Sublist"
:RHS-Node-Types
((THE-UP-ACCUM . ACCUMULATE-UP))
:Input-Embedding
(((CONS-ACCUMULATE-UP-FROM-SUBLIST 1) (THE-UP-ACCUM 2))
 ((CONS-ACCUMULATE-UP-FROM-SUBLIST 2) (THE-UP-ACCUM 1)))
:Output-Embedding
(((CONS-ACCUMULATE-UP-FROM-SUBLIST 3) (THE-UP-ACCUM 3)))
:L-R-Link IMPLEMENTATION
:Doc
("accumulates the elements of ~A into a list whose tail is ~A."
 (INPUT-PORT-NAME> (DOC-BP> (CONS-ACCUMULATE-UP-FROM-SUBLIST 1)))
 (INPUT-PORT-NAME> (DOC-BP> (CONS-ACCUMULATE-UP-FROM-SUBLIST 2)))))

(Defrule LIST-EMPTY
"List Empty"
:RHS-Node-Types
((THE-NULL . TEST-PREDICATE))
:Input-Embedding
(((LIST-EMPTY 1) (THE-NULL 1)))
:L-R-Link IMPLEMENTATION
:Doc
("checks whether the list ~A is empty."
 (INPUT-PORT-NAME> (DOC-BP> (LIST-EMPTY 1)))))

;;; Figure 4-14.

(Defrule GENERATION
"Generation"
:RHS-Node-Types
((GEN-FUNCTION . ANY-GEN-F))
:Input-Embedding
(((GENERATION 1) (GEN-FUNCTION 1)))
:St-Thrus
(((GENERATION 1) (GENERATION 2)))
:L-R-Link COMPOSITION
:Doc
("generates the successive elements of ~A by repeatedly applying the
function ~A to the result of its preceding application."
 (INPUT-PORT-NAME> (DOC-BP> (GENERATION 1)))
 (FUNCTION-TYPE (FUNCTION-INFO (N> GEN-FUNCTION)))))

(Defrule GENERATE
"Generate"
:RHS-Node-Types
((THE-COUNT . COUNT))
:Input-Embedding
(((GENERATE 1) (THE-COUNT 1)))
:Output-Embedding
(((GENERATE 2) (THE-COUNT 2)))
:L-R-Link IMPLEMENTATION
:Doc
("generates the elements of ~A by counting them."
 (INPUT-PORT-NAME> (DOC-BP> (GENERATE 1)))))

(Defrule GENERATE
"Generate"
:RHS-Node-Types
((ITER-GEN . GENERATION))
:Input-Embedding
(((GENERATE 1) (ITER-GEN 1)))
:Output-Embedding
(((GENERATE 2) (ITER-GEN 2)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("generates a series of elements of ~A by repeatedly applying the ~
function ~A."
 (INPUT-PORT-NAME> (DOC-BP> (GENERATE 1)))
 (FUNCTION-TYPE (FUNCTION-INFO (N> ITER-GEN)))))

(Defrule COMMUTATIVE-BINARY-FUNCTION
"Commutative Binary Function"
:RHS-Node-Types
((COMM-BIN-FUNCTION . ANY-COMM-BIN-F))
:Input-Embedding
(((COMMUTATIVE-BINARY-FUNCTION 1) (COMM-BIN-FUNCTION 2))
 ((COMMUTATIVE-BINARY-FUNCTION 2) (COMM-BIN-FUNCTION 1)))
:Output-Embedding
(((COMMUTATIVE-BINARY-FUNCTION 3) (COMM-BIN-FUNCTION 3)))
:L-R-Link IMPLEMENTATION
:Doc
("applies the commutative binary function ~A."
 (FUNCTION-TYPE (FUNCTION-INFO (N> COMM-BIN-FUNCTION)))))

(Defrule COMMUTATIVE-BINARY-FUNCTION
"Commutative Binary Function"
:RHS-Node-Types
((COMM-BIN-FUNCTION . ANY-COMM-BIN-F))
:Input-Embedding
(((COMMUTATIVE-BINARY-FUNCTION 1) (COMM-BIN-FUNCTION 1))
 ((COMMUTATIVE-BINARY-FUNCTION 2) (COMM-BIN-FUNCTION 2)))
:Output-Embedding
(((COMMUTATIVE-BINARY-FUNCTION 3) (COMM-BIN-FUNCTION 3)))
:L-R-Link IMPLEMENTATION
:Doc
("applies the commutative binary function ~A."
 (FUNCTION-TYPE (FUNCTION-INFO (N> COMM-BIN-FUNCTION)))))

(Defrule INCREMENT
"Increment"
:RHS-Node-Types
((COMM-INC . COMMUTATIVE-BINARY-FUNCTION))
:Input-Embedding
(((INCREMENT 1) (COMM-INC 1)))
:Output-Embedding
(((INCREMENT 2) (COMM-INC 3)))
:L-R-Link IMPLEMENTATION
:Doc
("increments ~A by 1."
 (INPUT-PORT-NAME> (DOC-BP> (INCREMENT 1)))))

;;; Figure 4-5.

(Defrule COUNTING-UP
"Counting Up"
:RHS-Node-Types
((COUNTER . INCREMENT))
:Input-Embedding
(((COUNTING-UP 1) (COUNTER 1)))
:St-Thrus
(((COUNTING-UP 1) (COUNTING-UP 2)))
:L-R-Link COMPOSITION
:Doc
("repeatedly increments ~A by 1."
 (INPUT-PORT-NAME> (DOC-BP> (COUNTING-UP 1)))))

(Defrule COUNT
"Count"
:RHS-Node-Types
((ITER-COUNTING . COUNTING-UP))
:Input-Embedding
(((COUNT 1) (ITER-COUNTING 1)))
:Output-Embedding
(((COUNT 2) (ITER-COUNTING 2)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("generates a series of successive integers starting with ~A."
 (INPUT-PORT-NAME> (DOC-BP> (COUNT 1)))))

(Defrule BOUNDED-COUNT
"Bounded Count"
:RHS-Node-Types
((THE-COUNTER . COUNT)
 (STOP-AT-LIMIT . BINARY-TRUNCATE))
:Edge-List
(((THE-COUNTER 2) . (STOP-AT-LIMIT 1)))
:Input-Embedding
(((BOUNDED-COUNT 1) (THE-COUNTER 1))
 ((BOUNDED-COUNT 2) (STOP-AT-LIMIT 2)))
:Output-Embedding
(((BOUNDED-COUNT 3) (STOP-AT-LIMIT 3)))
:L-R-Link COMPOSITION
:Doc
("generates a series of successive integers from ~A up to, but ~
not including ~A."
 (INPUT-PORT-NAME> (DOC-BP> (BOUNDED-COUNT 1)))
 (INPUT-PORT-NAME> (DOC-BP> (BOUNDED-COUNT 2)))))

(Defrule DECREMENT
"Decrement"
:RHS-Node-Types
((SUBTRACT . MINUS))
:Input-Embedding
(((DECREMENT 1) (SUBTRACT 1)))
:Output-Embedding
(((DECREMENT 2) (SUBTRACT 3)))
:L-R-Link IMPLEMENTATION
:Doc
("decrements ~A by 1."
 (INPUT-PORT-NAME> (DOC-BP> (DECREMENT 1)))))

(Defrule INCREMENT-OR-DECREMENT
"Increment or Decrement"
:RHS-Node-Types
((DECREMENTER . DECREMENT))
:Input-Embedding
(((INCREMENT-OR-DECREMENT 1) (DECREMENTER 1)))
:Output-Embedding
(((INCREMENT-OR-DECREMENT 2) (DECREMENTER 2)))
:L-R-Link IMPLEMENTATION
:Doc
("increments or decrements ~A."
 (INPUT-PORT-NAME (DOC-BP> (DECREMENTER 1)))))

(Defrule INCREMENT-OR-DECREMENT
"Increment or Decrement"
:RHS-Node-Types
((COUNTER . INCREMENT))
:Input-Embedding
(((INCREMENT-OR-DECREMENT 1) (COUNTER 1)))
:Output-Embedding
(((INCREMENT-OR-DECREMENT 2) (COUNTER 2)))
:L-R-Link IMPLEMENTATION
:Doc
("increments or decrements ~A."
 (INPUT-PORT-NAME (DOC-BP> (COUNTER 1)))))

(Defrule DOUBLE
"Double"
:RHS-Node-Types
((COMM-TIMES . COMMUTATIVE-BINARY-FUNCTION))
:Input-Embedding
(((DOUBLE 1) (COMM-TIMES 1)))
:Output-Embedding
(((DOUBLE 2) (COMM-TIMES 3)))
:L-R-Link IMPLEMENTATION
:Doc
("multiplies ~A by 2."
 (INPUT-PORT-NAME> (DOC-BP> (DOUBLE 1)))))

(Defrule CAR-MAP
"Car Map"
:RHS-Node-Types
((MAP-HEAD . CAR))
:Input-Embedding
(((CAR-MAP 1) (MAP-HEAD 1)))
:Output-Embedding
(((CAR-MAP 2) (MAP-HEAD 2)))
:L-R-Link COMPOSITION
:Doc
("applies the function CAR to each element of the input series."))

(Defrule SELECT-TERM
"Select Term"
:RHS-Node-Types
((ACCESS-ARRAY . AREF))
:Input-Embedding
(((SELECT-TERM 1) (ACCESS-ARRAY 1)
  array>sequence)
 ((SELECT-TERM 2) (ACCESS-ARRAY 2)))
:Output-Embedding
(((SELECT-TERM 3) (ACCESS-ARRAY 3)))
:L-R-Link IMPLEMENTATION
:Doc
("selects the element at index ~A from the sequence ~A."
 (INPUT-PORT-NAME> (DOC-BP> (SELECT-TERM 2)))
 (INPUT-PORT-NAME> (DOC-BP> (SELECT-TERM 1)))))

(Defrule SELECT-TERM-MAP
"Select-Term Map"
:RHS-Node-Types
((MAP-SEQUENCE-REF . SELECT-TERM))
:Input-Embedding
(((SELECT-TERM-MAP 1) (MAP-SEQUENCE-REF 1))
 ((SELECT-TERM-MAP 2) (MAP-SEQUENCE-REF 2)))
:Output-Embedding
(((SELECT-TERM-MAP 3) (MAP-SEQUENCE-REF 3)))
:L-R-Link COMPOSITION
:Doc
("references the sequence ~A at each index in the input series ~A."
 (INPUT-PORT-NAME> (DOC-BP> (SELECT-TERM-MAP 1)))
 (INPUT-PORT-NAME> (DOC-BP> (SELECT-TERM-MAP 2)))))

(Defrule FILTERING
"Filtering"
:RHS-Node-Types
((FILTER-PREDICATE . TEST-PREDICATE))
:Input-Embedding
(((FILTERING 1) (FILTER-PREDICATE 1)))
:St-Thrus
(((FILTERING 1) (FILTERING 2)))
:L-R-Link COMPOSITION
:Doc
("repeatedly applies the predicate ~A to ~A."
 (FUNCTION-TYPE (PREDICATE-INFO (N> FILTER-PREDICATE)))
 (INPUT-PORT-NAME> (DOC-BP> (FILTER-PREDICATE 1)))))

(Defrule FILTER
"Filter"
:RHS-Node-Types
((FILTER-ELTS . FILTERING))
:Input-Embedding
(((FILTER 1) (FILTER-ELTS 1)))
:Output-Embedding
(((FILTER 2) (FILTER-ELTS 2)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("filters the elements of the input series using the predicate ~A."
 (FUNCTION-TYPE (PREDICATE-INFO (N> FILTER-ELTS)))))
 (THE-TRUNCATE . TRUNCATE))
:Edge-List
(((THE-GENERATE 2) . (THE-TRUNCATE 1)))
:Input-Embedding
(((SLE 1) (THE-GENERATE 1)))
:Output-Embedding
(((SLE 2) (THE-TRUNCATE 2)))
:L-R-Link COMPOSITION
:Doc
("enumerates the successive sublists of ~A."
 (INPUT-PORT-NAME> (DOC-BP> (SLE 1)))))

(Defrule LE
"List Enumeration"
:RHS-Node-Types
((THE-SLE . SLE)
 (THE-CAR-MAP . CAR-MAP))
:Edge-List
(((THE-SLE 2) . (THE-CAR-MAP 1)))
:Input-Embedding
(((LE 1) (THE-SLE 1)))
:Output-Embedding
(((LE 2) (THE-CAR-MAP 2)))
:L-R-Link COMPOSITION
:Doc
("enumerates the elements of ~A."
 (INPUT-PORT-NAME> (DOC-BP> (LE 1)))))

;;; Figure 4-16.

(Defrule ITERATIVE-SEARCH
"Iterative Search"
:RHS-Node-Types
((SEARCH-P . TEST-PREDICATE))
:Input-Embedding
(((ITERATIVE-SEARCH 1) (SEARCH-P 1)))
:St-Thrus
(((ITERATIVE-SEARCH 1) (ITERATIVE-SEARCH 2)))
:L-R-Link COMPOSITION
:Doc
("repeatedly applies the search predicate ~A to a value, ~
terminating when an element is found that satisfies it."
 (FUNCTION-TYPE (PREDICATE-INFO (N> SEARCH-P)))))

;;; Figure 4-17.

(Defrule ACCUMULATION-DOWN
"Accumulation Down"
:RHS-Node-Types
((ACCUM-F . ANY-BIN-F))
:Input-Embedding
(((ACCUMULATION-DOWN 1) (ACCUM-F 1))
 ((ACCUMULATION-DOWN 2) (ACCUM-F 2)))
:St-Thrus
(((ACCUMULATION-DOWN 2) (ACCUMULATION-DOWN 3)))
:L-R-Link COMPOSITION
:Doc
("repeatedly applies the function ~A to the result of its ~
previous application and a new value. When the iteration ~
terminates, the result of the last application is returned."
 (FUNCTION-TYPE (FUNCTION-INFO (N> ACCUM-F)))))

(Defrule ACCUMULATE-DOWN
"Accumulate Down"
:RHS-Node-Types
((ITER-ACCUM . ACCUMULATION-DOWN))
:Input-Embedding
(((ACCUMULATE-DOWN 1) (ITER-ACCUM 1))
 ((ACCUMULATE-DOWN 2) (ITER-ACCUM 2)))
:Output-Embedding
(((ACCUMULATE-DOWN 3) (ITER-ACCUM 3)))
:L-R-Link TEMPORAL-ABSTRACTION
:Doc
("accumulates the values of the input series 'on the way down' ~
using the function ~A."
 (FUNCTION-TYPE (FUNCTION-INFO (N> ITER-ACCUM)))))

(Defrule TRUNCATION
"Truncation"
:RHS-Node-Types
((STOP? . TEST-PREDICATE))
:Input-Embedding
(((TRUNCATION 1) (STOP? 1)))
:St-Thrus
(((TRUNCATION 1) (TRUNCATION 2)))
:L-R-Link COMPOSITION
:Doc
("repeatedly applies the exit test ~A to a value, terminating ~
the iteration if the test succeeds."
 (FUNCTION-TYPE (PREDICATE-INFO (N> STOP?)))))

(Defrule TRUNCATE
  "Truncate"
  :RHS-Node-Types
  ((ITER-TRUNCATION . TRUNCATION))
  :Input-Embedding
  (((TRUNCATE 1) (ITER-TRUNCATION 1)))
  :Output-Embedding
  (((TRUNCATE 2) (ITER-TRUNCATION 2)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("outputs the elements of the input series up to but not ~
    including the one that passes the predicate ~A."
   (FUNCTION-TYPE (PREDICATE-INFO (N> ITER-TRUNCATION)))))

(Defrule BINARY-TRUNCATION
  "Binary Truncation"
  :RHS-Node-Types
  ((BINARY-STOP? . BINARY-TEST-PREDICATE))
  :Input-Embedding
  (((BINARY-TRUNCATION 1) (BINARY-STOP? 1))
   ((BINARY-TRUNCATION 2) (BINARY-STOP? 2)))
  :St-Thrus
  (((BINARY-TRUNCATION 1) (BINARY-TRUNCATION 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("repeatedly applies the binary exit test ~A to a value, ~
    terminating the iteration if the test succeeds."
   (FUNCTION-TYPE (PREDICATE-INFO (N> BINARY-TRUNCATION)))))

(Defrule BINARY-TRUNCATE
  "Binary Truncate"
  :RHS-Node-Types
  ((ITER-BIN-TRUNCATION . BINARY-TRUNCATION))
  :Input-Embedding
  (((BINARY-TRUNCATE 1) (ITER-BIN-TRUNCATION 1))
   ((BINARY-TRUNCATE 2) (ITER-BIN-TRUNCATION 2)))
  :Output-Embedding
  (((BINARY-TRUNCATE 3) (ITER-BIN-TRUNCATION 3)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("outputs the elements of the input series up to but not ~
    including the one that passes the binary predicate ~A."
   (FUNCTION-TYPE (PREDICATE-INFO (N> BINARY-TRUNCATE)))))

(Defrule SLE
  "Sublist Enumeration"
  :RHS-Node-Types
  ((THE-GENERATE . GENERATE)


(Defrule EARLIEST
  "Earliest"
  :RHS-Node-Types
  ((EARLIEST? . ITERATIVE-SEARCH))
  :Input-Embedding
  (((EARLIEST 1) (EARLIEST? 1)))
  :Output-Embedding
  (((EARLIEST 2) (EARLIEST? 2)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("outputs the first element of the input series which passes the ~
    predicate ~A."
   (FUNCTION-TYPE (PREDICATE-INFO (N> EARLIEST?)))))
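As a rough reading of the TRUNCATE and EARLIEST documentation strings, both cliches consume a series and a predicate: one cuts the series off before the first element that passes, the other returns that element. The sketch below is a hedged Python illustration; `truncate`, `earliest`, `stop`, and `pred` are hypothetical names, and returning `None` when no element passes is an assumption made for the example, since the rules do not say what happens in that case.

```python
from itertools import takewhile
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def truncate(series: Iterable[T], stop: Callable[[T], bool]) -> Iterator[T]:
    """Yield elements up to but not including the first one passing stop."""
    return takewhile(lambda value: not stop(value), series)


def earliest(series: Iterable[T], pred: Callable[[T], bool]) -> Optional[T]:
    """Return the first element passing pred, or None if there is none."""
    for value in series:
        if pred(value):
            return value
    return None


prefix = list(truncate([1, 2, 3, 4], lambda v: v > 2))
first_even = earliest([1, 4, 9, 16], lambda v: v % 2 == 0)
```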


(Defrule SEQ-LIST-SEARCH
  "Sequential List Search"
  :RHS-Node-Types
  ((LIST-ENUM . LE)
   (SEQ-SEARCH . SEQUENTIAL-SEARCH))
  :Edge-List
  (((LIST-ENUM 2) . (SEQ-SEARCH 1)))
  :Input-Embedding
  (((SEQ-LIST-SEARCH 1) (LIST-ENUM 1)))
  :Output-Embedding
  (((SEQ-LIST-SEARCH 2) (SEQ-SEARCH 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("sequentially searches the elements of the list ~A until either the
    list is exhausted or an element is found that satisfies the test ~A."
   (INPUT-PORT-NAME> (DOC-BP> (SEQ-LIST-SEARCH 1)))


(Defrule SEQUENTIAL-SEARCH
  "Sequential Search"
  :RHS-Node-Types
  ((EXIT . TEST-PREDICATE)
   (SEARCH . EARLIEST))
  :Input-Embedding
  (((SEQUENTIAL-SEARCH 1) (SEARCH 1)))
  :Output-Embedding
  (((SEQUENTIAL-SEARCH 2) (SEARCH 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("finds the first element of ~A satisfying the predicate ~A, ~
    unless ~A is satisfied first."
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENTIAL-SEARCH 1)))
   (FUNCTION-TYPE (PREDICATE-INFO (N> SEARCH)))
   (FUNCTION-TYPE (PREDICATE-INFO (N> EXIT)))))
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(((NEW-SEQUENCE 1) (MAKE-SEQ 1)))
:Output-Embedding
(((NEW-SEQUENCE 2) (MAKE-SEQ 2)
  ARRAY>SEQUENCE))
:L-R-Link IMPLEMENTATION
:Doc
("creates a new sequence of size ~A."
 (INPUT-PORT-NAME> (DOC-BP> (NEW-SEQUENCE

(Defrule SEQUENCE-SIZE
  "Sequence Size"
  :RHS-Node-Types
  ((MEASURE-SEQUENCE . ARRAY-TOTAL-SIZE))
  :Input-Embedding
  (((SEQUENCE-SIZE 1) (MEASURE-SEQUENCE 1)
    ARRAY>SEQUENCE))
  :Output-Embedding
  (((SEQUENCE-SIZE 2) (MEASURE-SEQUENCE 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("computes the size of the sequence ~A."
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-SIZE 1)))))


(FUNCTION-TYPE (PREDICATE-INFO (N> SEQ-SEARCH)))))

(Defrule CONS-ACCUMULATE-DOWN
  "Cons Accumulate on the way down"
  :RHS-Node-Types
  ((THE-ACCUM . ACCUMULATE-DOWN))
  :Input-Embedding
  (((CONS-ACCUMULATE-DOWN 1) (THE-ACCUM 1)))
  :Output-Embedding
  (((CONS-ACCUMULATE-DOWN 2) (THE-ACCUM 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("accumulates the elements of the input series ~A into a list ~
    using cons."
   (INPUT-PORT-NAME> (DOC-BP> (CONS-ACCUMULATE-DOWN 1)))))

(Defrule REVERSE-LIST
  "Reverse List"
  :RHS-Node-Types
  ((ENUMERATE-LIST . LE)
   (ACCUM-LIST . CONS-ACCUMULATE-DOWN))
  :Edge-List
  (((ENUMERATE-LIST 2) . (ACCUM-LIST 1)))
  :Input-Embedding
  (((REVERSE-LIST 1) (ENUMERATE-LIST 1)))
  :Output-Embedding
  (((REVERSE-LIST 2) (ACCUM-LIST 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("constructs a list containing the elements of ~A in reverse."
   (INPUT-PORT-NAME> (DOC-BP> (REVERSE-LIST 1)))))

(Defrule TRAILING-GENERATION
  "Trailing Generation"
  :RHS-Node-Types
  ((TR-GEN-FUNCTION . ANY-GEN-F))
  :Input-Embedding
  (((TRAILING-GENERATION 1) (TR-GEN-FUNCTION 1)))
  :Output-Embedding
  (((TRAILING-GENERATION 3) (TR-GEN-FUNCTION 2)))
  :St-Thrus
  (((TRAILING-GENERATION 1) (TRAILING-GENERATION 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("generates the successive previous and current elements of ~A ~
    by repeatedly applying the function ~A to the result of ~
    the preceding application of that function."
   (INPUT-PORT-NAME> (DOC-BP> (TRAILING-GENERATION 1)))
   (FUNCTION-TYPE (FUNCTION-INFO (N> TR-GEN-FUNCTION)))))

(Defrule TRAILING-GENERATE
  "Trailing Generate"
  :RHS-Node-Types
  ((ITER-TRAILING-GEN . TRAILING-GENERATION))
  :Input-Embedding
  (((TRAILING-GENERATE 1) (ITER-TRAILING-GEN 1)))
  :Output-Embedding
  (((TRAILING-GENERATE 2) (ITER-TRAILING-GEN 2))
   ((TRAILING-GENERATE 3) (ITER-TRAILING-GEN 3)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("generates a series of the elements of ~A and a series of the ~
    elements immediately preceding each of the elements in that
    series."
   (INPUT-PORT-NAME> (DOC-BP> (TRAILING-GENERATE 1)))))

(Defrule TRAILING-PTR-LE
  "Trailing Pointer List Enumeration"
  :RHS-Node-Types
  ((TR-GEN . TRAILING-GENERATE)
   (PREVIOUS-CAR-MAP . CAR-MAP)
   (CURRENT-CAR-MAP . CAR-MAP)
   (NULL-TRUNC . TRUNCATE))
  :Edge-List
  (((TR-GEN 3) . (CURRENT-CAR-MAP 1))
   ((TR-GEN 3) . (NULL-TRUNC 1))
   ((TR-GEN 2) . (PREVIOUS-CAR-MAP 1)))
  :Input-Embedding
  (((TRAILING-PTR-LE 1) (TR-GEN 1)))
  :Output-Embedding
  (((TRAILING-PTR-LE 2) (PREVIOUS-CAR-MAP 2))
   ((TRAILING-PTR-LE 3) (CURRENT-CAR-MAP 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates the elements of the list ~A, along with their ~
    immediately preceding elements."
   (INPUT-PORT-NAME> (DOC-BP> (TRAILING-PTR-LE 1)))))
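The trailing-pointer idiom that TRAILING-PTR-LE captures (walking a list while keeping a pointer one step behind the current position) can be illustrated with a small generator. This is a sketch under assumptions, not the recognizer's semantics: the name `trailing_ptr_le` is hypothetical, and pairing the first element with the second as the first output is one plausible reading of "along with their immediately preceding elements".

```python
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def trailing_ptr_le(items: Iterable[T]) -> Iterator[Tuple[T, T]]:
    """Yield (previous, current) pairs while walking items once."""
    iterator = iter(items)
    try:
        previous = next(iterator)
    except StopIteration:
        return  # An empty list has no elements to enumerate.
    for current in iterator:
        yield previous, current
        # Advance the trailing pointer to where the current one was.
        previous = current


pairs = list(trailing_ptr_le(["a", "b", "c"]))
```

Keeping the previous element at hand is what makes operations such as splicing into or out of a singly linked list possible in a single pass.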


(Defrule NEW-TERM
  "New Term"
  :RHS-Node-Types
  ((THE-CR . COPY-REPLACE-ELT))
  :Input-Embedding
  (((NEW-TERM 1) (THE-CR 3)
    ARRAY>SEQUENCE)
   ((NEW-TERM 2) (THE-CR 2))
   ((NEW-TERM 3) (THE-CR 1)))
  :Output-Embedding
  (((NEW-TERM 4) (THE-CR 4)
    ARRAY>SEQUENCE))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("creates a new sequence with the same elements as the input sequence ~
    ~A at the same locations, except that the element ~A is at the ~
    index ~A."
   (INPUT-PORT-NAME> (DOC-BP> (NEW-TERM 1)))
   (INPUT-PORT-NAME> (DOC-BP> (NEW-TERM 3)))
   (INPUT-PORT-NAME> (DOC-BP> (NEW-TERM 2)))))

(Defrule SEQUENCE-ACCUMULATION
  "Sequence Accumulation"
  :RHS-Node-Types
  ((THE-NT . NEW-TERM))
  :Input-Embedding
  (((SEQUENCE-ACCUMULATION 1) (THE-NT 3))
   ((SEQUENCE-ACCUMULATION 2) (THE-NT 2))
   ((SEQUENCE-ACCUMULATION 3) (THE-NT 1)))
  :St-Thrus
  (((SEQUENCE-ACCUMULATION 3) (SEQUENCE-ACCUMULATION 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("repeatedly inserts an element ~A (a new element on each iteration) ~
    in the sequence ~A at the location ~A (which is a different index on ~
    each iteration). When the iteration terminates, the sequence ~
    resulting from the last insertion is returned."
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-ACCUMULATION 1)))
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-ACCUMULATION 3)))
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-ACCUMULATION 2)))))


(Defrule SEQUENCE-ACCUMULATE
  "Sequence Accumulate"
  :RHS-Node-Types
  ((ARRAY-ACCUM . SEQUENCE-ACCUMULATION))
  :Input-Embedding
  (((SEQUENCE-ACCUMULATE 1) (ARRAY-ACCUM 1))
   ((SEQUENCE-ACCUMULATE 2) (ARRAY-ACCUM 2))
   ((SEQUENCE-ACCUMULATE 3) (ARRAY-ACCUM 3)))
  :Output-Embedding
  (((SEQUENCE-ACCUMULATE 4) (ARRAY-ACCUM 4)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("accumulates the values of the input series ~A into a sequence ~A at the
    series of indices ~A."
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-ACCUMULATE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-ACCUMULATE 3)))
   (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-ACCUMULATE 2)))))

(Defrule SEQUENCE-ENUMERATION
  "Sequence Enumeration"
  :RHS-Node-Types
  ((GENERATE-INDICES . BOUNDED-COUNT)
   (COMPUTE-INDEX-LIMIT . SEQUENCE-SIZE)
   (ACCESS-SEQUENCE . SELECT-TERM-MAP))
  :Edge-List
  (((GENERATE-INDICES 3) . (ACCESS-SEQUENCE 2))
   ((COMPUTE-INDEX-LIMIT 2) . (GENERATE-INDICES 2)))
  :Input-Embedding
  (((SEQUENCE-ENUMERATION 1) (ACCESS-SEQUENCE 1))
   ((SEQUENCE-ENUMERATION 1) (COMPUTE-INDEX-LIMIT 1)))


(Defrule NEW-SEQUENCE
  "New Sequence"
  :RHS-Node-Types
  ((MAKE-SEQ . MAKE-ARRAY))
  :Input-Embedding
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:Output-Embedding
(((SEQUENCE-ENUMERATION 2) (ACCESS-SEQUENCE 3)))
:L-R-Link COMPOSITION
:Doc
("enumerates the elements of the sequence ~A."
 (INPUT-PORT-NAME> (DOC-BP> (SEQUENCE-ENUMERATION

(Defrule SEQUENCE-AND-INDEX-ENUMERATION
  "Sequence and Index Enumeration"
  :RHS-Node-Types
  ((GENERATE-INDICES . BOUNDED-COUNT)
   (COMPUTE-INDEX-LIMIT . SEQUENCE-SIZE)
   (ACCESS-SEQUENCE . SELECT-TERM-MAP))
  :Edge-List
  (((GENERATE-INDICES 3) . (ACCESS-SEQUENCE 2))
   ((COMPUTE-INDEX-LIMIT 2) . (GENERATE-INDICES 2)))
  :Input-Embedding
  (((SEQUENCE-AND-INDEX-ENUMERATION 1) (ACCESS-SEQUENCE 1))
   ((SEQUENCE-AND-INDEX-ENUMERATION 1) (COMPUTE-INDEX-LIMIT 1)))
  :Output-Embedding
  (((SEQUENCE-AND-INDEX-ENUMERATION 2) (ACCESS-SEQUENCE 3))
   ((SEQUENCE-AND-INDEX-ENUMERATION 3) (GENERATE-INDICES 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("enumerates the elements of the sequence ~A and their indices."
   (INPUT-PORT-NAME>
    (DOC-BP> (SEQUENCE-AND-INDEX-ENUMERATION 1)))))
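The three right-hand-side nodes of SEQUENCE-AND-INDEX-ENUMERATION correspond to the three parts of a familiar indexed loop. The Python sketch below shows that correspondence only as an illustration; the function name is hypothetical, and starting the count at index 0 is an assumption, since the rule's input embedding does not show where the lower bound of BOUNDED-COUNT comes from.

```python
from typing import Iterator, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sequence_and_index_enumeration(
        seq: Sequence[T], start: int = 0) -> Iterator[Tuple[T, int]]:
    """Yield each element of seq together with its index."""
    limit = len(seq)                   # COMPUTE-INDEX-LIMIT (SEQUENCE-SIZE)
    for index in range(start, limit):  # GENERATE-INDICES (BOUNDED-COUNT)
        yield seq[index], index        # ACCESS-SEQUENCE (SELECT-TERM-MAP)


enumerated = list(sequence_and_index_enumeration("abc"))
```

SEQUENCE-ENUMERATION has the same right-hand side and differs only in omitting the index series from its output embedding.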

(Defrule LIST-TO-SEQUENCE
  "Transfer List to Sequence"
  :RHS-Node-Types
  ((ENUMERATE-LIST-ELTS . LE)
   (NEW-BASE . NEW-SEQUENCE)
   (COUNT-INDICES . COUNT)
   (ACCUMULATE-SEQUENCE . SEQUENCE-ACCUMULATE))
  :Edge-List
  (((ENUMERATE-LIST-ELTS 2) . (ACCUMULATE-SEQUENCE 1))
   ((NEW-BASE 2) . (ACCUMULATE-SEQUENCE 3))
   ((COUNT-INDICES 2) . (ACCUMULATE-SEQUENCE 2)))
  :Input-Embedding
  (((LIST-TO-SEQUENCE 1) (ENUMERATE-LIST-ELTS 1))
   ((LIST-TO-SEQUENCE 2) (NEW-BASE 1)))
  :Output-Embedding
  (((LIST-TO-SEQUENCE 3) (ACCUMULATE-SEQUENCE 4)))
  :L-R-Link COMPOSITION
  :Doc
  ("transfers the elements in the list ~A into a sequence ~
    of size ~A, by enumerating the elements of the list ~
    and accumulating them in the sequence at successive indices, ~
    starting with index ~A."
   (INPUT-PORT-NAME> (DOC-BP> (LIST-TO-SEQUENCE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (LIST-TO-SEQUENCE 2)))
   (INPUT-PORT-NAME> (DOC-BP> (COUNT-INDICES 1)))))
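Read as ordinary code, the LIST-TO-SEQUENCE composition allocates a sequence, counts indices upward, and stores each list element at the next index. The following Python sketch illustrates that reading under assumptions: the names are hypothetical, unfilled slots are shown as `None`, and the sequence is updated in place for brevity, whereas NEW-TERM is documented as creating a new sequence on each insertion.

```python
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def list_to_sequence(items: Iterable[T], size: int,
                     start: int = 0) -> List[Optional[T]]:
    """Copy items into a fresh sequence of the given size."""
    sequence: List[Optional[T]] = [None] * size  # NEW-BASE (NEW-SEQUENCE)
    index = start                                # COUNT-INDICES (COUNT)
    for item in items:                           # ENUMERATE-LIST-ELTS (LE)
        sequence[index] = item                   # ACCUMULATE-SEQUENCE
        index += 1
    return sequence


filled = list_to_sequence(["a", "b"], 3)
offset = list_to_sequence([1, 2], 4, start=1)
```

In this sketch a list longer than the remaining space raises `IndexError`; the rule itself does not say how that case is handled.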

(Defrule UNARY-PREDICATE
  "Unary Predicate"
  :RHS-Node-Types
  ((ANY-PRED . ANY-P))
  :Input-Embedding
  (((UNARY-PREDICATE 1) (ANY-PRED 1)))
  :Output-Embedding
  (((UNARY-PREDICATE 2) (ANY-PRED 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("applies the unary predicate ~A to ~A."
   (FUNCTION-TYPE (FUNCTION-INFO (N> ANY-PRED)))
   (INPUT-PORT-NAME> (DOC-BP> (ANY-PRED 1)))))

(Defrule TEST-PREDICATE
  "Test Predicate"
  :RHS-Node-Types
  ((TP-UNARY-P . UNARY-PREDICATE)
   (CHECK-IT . NULL-TEST))
  :Edge-List
  (((TP-UNARY-P 2) . (CHECK-IT 1)))
  :Input-Embedding
  (((TEST-PREDICATE 1) (TP-UNARY-P 1)))
  :L-R-Link COMPOSITION
  :Doc
  ("tests ~A using the unary predicate ~A."
   (INPUT-PORT-NAME> (DOC-BP> (TEST-PREDICATE 1)))
   (FUNCTION-TYPE (FUNCTION-INFO (N> CHECK-IT)))))

(Defrule BINARY-PREDICATE
  "Binary Predicate"
  :RHS-Node-Types
  ((ANY-BIN-PRED . ANY-BINARY-P))
  :Input-Embedding
  (((BINARY-PREDICATE 1) (ANY-BIN-PRED 1))
   ((BINARY-PREDICATE 2) (ANY-BIN-PRED 2)))
  :Output-Embedding
  (((BINARY-PREDICATE 3) (ANY-BIN-PRED 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("applies the binary predicate ~A to ~A and ~A."
   (FUNCTION-TYPE (FUNCTION-INFO (N> ANY-BIN-PRED)))
   (INPUT-PORT-NAME> (DOC-BP> (ANY-BIN-PRED 1)))
   (INPUT-PORT-NAME> (DOC-BP> (ANY-BIN-PRED 2)))))

(Defrule BINARY-TEST-PREDICATE
  "Binary Test Predicate"
  :RHS-Node-Types
  ((TP-BINARY-P . BINARY-PREDICATE)
   (NULL-CHECK . NULL-TEST))
  :Edge-List
  (((TP-BINARY-P 3) . (NULL-CHECK 1)))
  :Input-Embedding
  (((BINARY-TEST-PREDICATE 1) (TP-BINARY-P 1))
   ((BINARY-TEST-PREDICATE 2) (TP-BINARY-P 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("tests ~A and ~A using the binary predicate ~A."
   (INPUT-PORT-NAME> (DOC-BP> (BINARY-TEST-PREDICATE 1)))
   (INPUT-PORT-NAME> (DOC-BP> (BINARY-TEST-PREDICATE 2)))
   (FUNCTION-TYPE (FUNCTION-INFO (N> NULL-CHECK)))))

(Defrule SUMMING
  "Summing"
  :RHS-Node-Types
  ((THE-TALLY . COMMUTATIVE-BINARY-FUNCTION))
  :Input-Embedding
  (((SUMMING 1) (THE-TALLY 1))
   ((SUMMING 2) (THE-TALLY 2)))
  :St-Thrus
  (((SUMMING 2) (SUMMING 3)))
  :L-R-Link COMPOSITION
  :Doc
  ("keeps a running total of the numbers ~A."
   (INPUT-PORT-NAME> (DOC-BP> (SUMMING 1)))))

(Defrule SUM
  "Sum"
  :RHS-Node-Types
  ((TALLYING . SUMMING))
  :Input-Embedding
  (((SUM 1) (TALLYING 1)))
  :Output-Embedding
  (((SUM 2) (TALLYING 3)))
  :L-R-Link TEMPORAL-ABSTRACTION
  :Doc
  ("returns the sum of the numbers in the input series ~A."
   (INPUT-PORT-NAME> (DOC-BP> (SUM 1)))))

(Defrule MAX
  "Maximum"
  :RHS-Node-Types
  ((COMPUTE-MAX . BINARY-TEST-PREDICATE))
  :Input-Embedding
  (((MAX 1) (COMPUTE-MAX 1))
   ((MAX 2) (COMPUTE-MAX 2)))
  :St-Thrus
  (((MAX 2) (MAX 3))
   ((MAX 1) (MAX 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("computes the maximum of ~A and ~A."
   (INPUT-PORT-NAME> (DOC-BP> (MAX 1)))
   (INPUT-PORT-NAME> (DOC-BP> (MAX 2)))))

(Defrule MIN
  "Minimum"
  :RHS-Node-Types
  ((COMPUTE-MIN . BINARY-TEST-PREDICATE))
  :Input-Embedding
  (((MIN 1) (COMPUTE-MIN 1))
   ((MIN 2) (COMPUTE-MIN 2)))
  :St-Thrus
  (((MIN 2) (MIN 3))
   ((MIN 1) (MIN 3)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("computes the minimum of ~A and ~A."
   (INPUT-PORT-NAME> (DOC-BP> (MIN 1)))
   (INPUT-PORT-NAME> (DOC-BP> (MIN 2)))))

;;;  Figure  3-9. 

(Defrule SQUARE-ROOT-OF-SQUARE
  "Square-Root of Square"
  :RHS-Node-Types
  ((SQ . SQUARE)
   (TAKE-ROOT . SQRT))
  :Edge-List
  (((SQ 2) . (TAKE-ROOT 1)))
  :Input-Embedding
  (((SQUARE-ROOT-OF-SQUARE 1) (SQ 1)))
  :Output-Embedding
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(((SQUARE-ROOT-OF-SQUARE 2) (TAKE-ROOT 2)))
:L-R-Link COMPOSITION
:Doc
("computes the square root of the square of ~A"
 (INPUT-PORT-NAME> (DOC-BP> (SQUARE-ROOT-OF-SQUARE 1)))))

;;; Figures 3-9, 4-4.

(Defrule NEGATE-IF-NEGATIVE
  "Negate if Negative"
  :RHS-Node-Types
  ((NEGATIVE? . LT)
   (CONTROL-NEGATION . NULL-TEST)
   (THE-NEGATE . NEGATE))
  :Edge-List
  (((NEGATIVE? 3) . (CONTROL-NEGATION 1)))
  :Input-Embedding
  (((NEGATE-IF-NEGATIVE 1) (THE-NEGATE 1))
   ((NEGATE-IF-NEGATIVE 1) (NEGATIVE? 1)))
  :Output-Embedding
  (((NEGATE-IF-NEGATIVE 2) (THE-NEGATE 2)))
  :St-Thrus
  (((NEGATE-IF-NEGATIVE 1) (NEGATE-IF-NEGATIVE 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("negates ~A if it is negative."
   (INPUT-PORT-NAME> (DOC-BP> (NEGATE-IF-NEGATIVE 1)))))

;;;  Figure  3-9. 

(Defrule ABSOLUTE-VALUE
  "Absolute Value"
  :RHS-Node-Types
  ((SQRT-OF-SQ . SQUARE-ROOT-OF-SQUARE))
  :Input-Embedding
  (((ABSOLUTE-VALUE 1) (SQRT-OF-SQ 1)))
  :Output-Embedding
  (((ABSOLUTE-VALUE 2) (SQRT-OF-SQ 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("computes the absolute value of ~A by taking the square root of
    its square."
   (INPUT-PORT-NAME> (DOC-BP> (ABSOLUTE-VALUE 1)))))

;;;  Figure  3-9. 

(Defrule ABSOLUTE-VALUE
  "Absolute Value"
  :RHS-Node-Types
  ((NIN . NEGATE-IF-NEGATIVE))
  :Input-Embedding
  (((ABSOLUTE-VALUE 1) (NIN 1)))
  :Output-Embedding
  (((ABSOLUTE-VALUE 2) (NIN 2)))
  :L-R-Link IMPLEMENTATION
  :Doc
  ("computes the absolute value of ~A by negating it if it is ~
    negative."
   (INPUT-PORT-NAME> (DOC-BP> (ABSOLUTE-VALUE 1)))))
;;; Figure 3-9.

(Defrule EQUALITY-WITHIN-EPSILON
  "Equality Within an Epsilon"
  :RHS-Node-Types
  ((DIFF . MINUS)
   (TAKE-ABS . ABSOLUTE-VALUE)
   (WITHIN-EPSILON . LTE)
   (TEST-EWE . NULL-TEST))
  :Edge-List
  (((DIFF 3) . (ABSOLUTE-VALUE 1))
   ((WITHIN-EPSILON 3) . (TEST-EWE 1)))
  :Input-Embedding
  (((EQUALITY-WITHIN-EPSILON 1) (DIFF 1))
   ((EQUALITY-WITHIN-EPSILON 2) (DIFF 2)))
  :L-R-Link COMPOSITION
  :Doc
  ("determines whether ~A and ~A are within an epsilon ~A of each ~
    other."
   (INPUT-PORT-NAME> (DOC-BP> (EQUALITY-WITHIN-EPSILON 1)))
   (INPUT-PORT-NAME> (DOC-BP> (EQUALITY-WITHIN-EPSILON 2)))
   (INPUT-PORT-NAME> (DOC-BP> (EQUALITY-WITHIN-EPSILON 3)))))
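The two ABSOLUTE-VALUE rules give alternative implementations of the same abstraction, and EQUALITY-WITHIN-EPSILON is built on top of whichever one is recognized. The Python sketch below illustrates this with hypothetical function names; it is an informal rendering of the documentation strings, with LTE read as `<=`, and not a transcription of the grammar's semantics.

```python
import math


def abs_by_sqrt_of_square(x: float) -> float:
    """Absolute value as the square root of the square."""
    return math.sqrt(x * x)


def abs_by_negate_if_negative(x: float) -> float:
    """Absolute value by negating x only when it is negative."""
    return -x if x < 0 else x


def equal_within_epsilon(a: float, b: float, epsilon: float) -> bool:
    """True when a and b differ by at most epsilon."""
    return abs_by_negate_if_negative(a - b) <= epsilon


close = equal_within_epsilon(1.0, 1.05, 0.1)
far = equal_within_epsilon(1.0, 1.2, 0.1)
```

Because both implementations reduce to the same non-terminal, a parse of either form can be used wherever ABSOLUTE-VALUE appears on the right-hand side of another rule.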
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Index of Non-Terminal Node Types

314  (ABSOLUTE-VALUE 1:INTEGER 2:INTEGER)
311  (ACCUMULATE-DOWN 1:SERIES 2:ANY 3:ANY)
309  (ACCUMULATE-UP 1:SERIES 2:ANY 3:ANY)
311  (ACCUMULATION-DOWN 1:ANY 2:ANY 3:ANY)
309  (ACCUMULATION-UP 1:ANY 2:ANY 3:ANY)
293  (ADVANCE-NODES 1:SEQUENCE 2:SEQUENCE 3:QUEUE)
304  (ASSOCIATIVE-LIST-DELETE 1:ANY 2:ASSOCIATIVE-LIST
      3:ASSOCIATIVE-LIST)
304  (ASSOCIATIVE-LIST-INSERT 1:ANY 2:ANY 3:ASSOCIATIVE-LIST
      4:ASSOCIATIVE-LIST)
304  (ASSOCIATIVE-LIST-LOOKUP 1:ANY 2:ASSOCIATIVE-LIST 3:ANY)
301  (ASSOCIATIVE-SET-ADD 1:ANY 2:ANY 3:ASSOCIATIVE-SET
      4:ASSOCIATIVE-SET)
302  (ASSOCIATIVE-SET-LOOKUP 1:ANY 2:ASSOCIATIVE-SET 3:ANY)
302  (ASSOCIATIVE-SET-REMOVE 1:ANY 2:ASSOCIATIVE-SET
      3:ASSOCIATIVE-SET)
294  (AVERAGE-LOCAL-BUFFER-SIZE 1:SEQUENCE 2:INTEGER)
313  (BINARY-PREDICATE 1:ANY 2:ANY 3:ANY)
313  (BINARY-TEST-PREDICATE 1:ANY 2:ANY)
311  (BINARY-TRUNCATE 1:SERIES 2:ANY 3:SERIES)
311  (BINARY-TRUNCATION 1:ANY 2:ANY 3:ANY)
297  (BOUNDED-CIS-ENUMERATION 1:CIRCULAR-INDEXED-SEQUENCE
      2:INTEGER 3:INTEGER 4:INTEGER
      5:SERIES)
310  (BOUNDED-COUNT 1:INTEGER 2:INTEGER 3:SERIES)
301  (BUMP+UPDATE 1:ANY 2:INDEXED-SEQUENCE 3:INDEXED-SEQUENCE)
310  (CAR-MAP 1:SERIES 2:SERIES)
303  (CHAINING-HT-DELETE 1:ANY 2:HASH-TABLE 3:HASH-TABLE)
303  (CHAINING-HT-FILL-COUNT-DELETE 1:ANY 2:HASH-TABLE
      3:HASH-TABLE)
304  (CHAINING-HT-FILL-COUNT-INSERT 1:ANY 2:ANY 3:HASH-TABLE
      4:HASH-TABLE)
303  (CHAINING-HT-INSERT 1:ANY 2:ANY 3:HASH-TABLE 4:HASH-TABLE)
302  (CHAINING-HT-LOOKUP 1:ANY 2:HASH-TABLE 3:ANY)
297  (CIRCULAR-INDEXED-SEQUENCE-ENUMERATION
      1:CIRCULAR-INDEXED-SEQUENCE 2:SERIES)
297  (CIS-ADD 1:ANY 2:CIRCULAR-INDEXED-SEQUENCE
      3:CIRCULAR-INDEXED-SEQUENCE)
296  (CIS-DESTRUCTIVE-ENUMERATION 1:CIRCULAR-INDEXED-SEQUENCE
      2:SERIES)
296  (CIS-EMPTY 1:CIRCULAR-INDEXED-SEQUENCE)
298  (CIS-EXTRACT 1:CIRCULAR-INDEXED-SEQUENCE 2:ANY
      3:CIRCULAR-INDEXED-SEQUENCE)
296  (CIS-FULL 1:CIRCULAR-INDEXED-SEQUENCE)
291  (CO-EARLIEST-EDS-FINISHED 1:SERIES 2:SERIES 3:SEQUENCE)
291  (CO-ITERATIVE-EDS-FINISHED 1:PRIORITY-QUEUE 2:SEQUENCE 3:ANY)
297  (COMBINATION-FUNCTION 1:INTEGER 2:INTEGER 3:INTEGER)
309  (COMMUTATIVE-BINARY-FUNCTION 1:ANY 2:ANY 3:ANY)
312  (CONS-ACCUMULATE-DOWN 1:SERIES 2:LINKED-LIST)
309  (CONS-ACCUMULATE-UP 1:SERIES 2:LINKED-LIST)
309  (CONS-ACCUMULATE-UP-FROM-SUBLIST 1:SERIES 2:LINKED-LIST
      3:LINKED-LIST)
310  (COUNT 1:INTEGER 2:SERIES)
310  (COUNTING-UP 1:INTEGER 2:INTEGER)
310  (DECREMENT 1:INTEGER 2:INTEGER)
292  (DELIVER-MESSAGE 1:MESSAGE 2:SEQUENCE 3:SEQUENCE)
292  (DELIVER-MESSAGE-ACCUMULATE 1:SERIES 2:SEQUENCE 3:SEQUENCE)
292  (DELIVER-MESSAGES 1:QUEUE 2:SEQUENCE 3:SEQUENCE)
294  (DELIVER-MESSAGES-AND-STEP-NODES 1:SEQUENCE 2:QUEUE
      3:SEQUENCE 4:QUEUE)
291  (DEQUEUE-AND-PROCESS-GENERATION 1:PRIORITY-QUEUE 2:SEQUENCE
      3:PRIORITY-QUEUE 4:SEQUENCE)
294  (DESTRUCTIVE-QUEUE-ENUMERATION 1:QUEUE 2:SERIES)
293  (DO-WORK-ACCUMULATE 1:SERIES 2:INTEGER 3:SEQUENCE 4:QUEUE
      5:SEQUENCE 6:QUEUE)
293  (DO-WORK-ACCUMULATION 1:SYNCH-NODE 2:INTEGER 3:SEQUENCE
      4:QUEUE 5:SEQUENCE 6:QUEUE)
310  (DOUBLE 1:INTEGER 2:INTEGER)
312  (EARLIEST 1:SERIES 2:ANY)
308  (EARLIEST-EQUAL-PRIORITY 1:SERIES 2:ANY 3:ANY)
308  (EARLIEST-EQUAL-PRIORITY-HEAD 1:SERIES 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST)
308  (EARLIEST-OAL-POSITION 1:SERIES 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST)
294  (EARLIEST-SIMULATION-FINISHED 1:SEQUENCE 2:QUEUE 3:SEQUENCE)
308  (EMPTY-OR-LOW-PRIORITY-HEAD 1:ORDERED-ASSOCIATIVE-LIST 2:ANY)
298  (ENUM-EVAL-COLLECT 1:LINKED-LIST 2:SEQUENCE
      3:EXECUTION-CONTEXT 4:QUEUE 5:LINKED-LIST
      6:SEQUENCE 7:EXECUTION-CONTEXT 8:QUEUE)
293  (ENUM-NODES+CHECK-BUFFERS 1:SEQUENCE)
306  (ENUM-OAL-FRONT 1:ORDERED-ASSOCIATIVE-LIST 2:ANY 3:SERIES)
306  (ENUM-OAL-FRONT-UNSAFE 1:ORDERED-ASSOCIATIVE-LIST 2:ANY
      3:SERIES)
292  (ENUMERATE-AND-DELIVER-MESSAGES 1:QUEUE 2:SEQUENCE
      3:SEQUENCE)
294  (ENUMERATE-NODES+COMPUTE-AVERAGE 1:SEQUENCE 2:INTEGER)
308  (EQUAL-PRIORITY-HEAD 1:ORDERED-ASSOCIATIVE-LIST 2:ANY)
308  (EQUAL-PRIORITY-TEST 1:ANY 2:ANY)
314  (EQUALITY-WITHIN-EPSILON 1:INTEGER 2:INTEGER)
299  (EVALUATE-AND-APPLY 1:SYMBOL 2:LINKED-LIST 3:SEQUENCE
      4:EXECUTION-CONTEXT 5:QUEUE 6:ANY
      7:SEQUENCE 8:EXECUTION-CONTEXT 9:QUEUE)
298  (EVALUATE-ARGUMENTS 1:LINKED-LIST 2:SEQUENCE 3:EXECUTION-CONTEXT
      4:QUEUE 5:LINKED-LIST 6:SEQUENCE
      7:EXECUTION-CONTEXT 8:QUEUE)
296  (EVALUATE-MAP 1:SERIES 2:SEQUENCE 3:EXECUTION-CONTEXT
      4:QUEUE 5:SERIES 6:SEQUENCE 7:EXECUTION-CONTEXT
      8:QUEUE)
291  (EVENT-DRIVEN-SIMULATION 1:EVENT 2:PRIORITY-QUEUE 3:SEQUENCE
      4:SEQUENCE)
293  (EXTRACT-AND-HANDLE-FIRST-MESSAGE 1:SYNCH-NODE 2:INTEGER 3:SEQUENCE
      4:QUEUE 5:SEQUENCE 6:QUEUE)
303  (FETCH+DELETE 1:ANY 2:HASH-TABLE 3:HASH-TABLE)
303  (FETCH+INSERT 1:ANY 2:ANY 3:HASH-TABLE 4:HASH-TABLE)
303  (FETCH+LOOKUP 1:ANY 2:HASH-TABLE 3:ANY)
301  (FETCH+UPDATE 1:INDEXED-SEQUENCE 2:ANY 3:INDEXED-SEQUENCE)
299  (FETCH-AND-APPLY-OPERATOR 1:SYMBOL 2:LINKED-LIST 3:SEQUENCE
      4:EXECUTION-CONTEXT 5:QUEUE 6:ANY
      7:SEQUENCE 8:EXECUTION-CONTEXT 9:QUEUE)
300  (FETCH-INSTRUCTION 1:INTEGER 2:SEQUENCE 3:INSTRUCTION
      4:INDEXED-SEQUENCE)
299  (FETCH-OP 1:SYMBOL 2:OPERATOR)
298  (FIFO-DEQUEUE 1:FIFO 2:ANY 3:FIFO)
296  (FIFO-DESTRUCTIVE-ENUMERATION 1:FIFO 2:SERIES)
296  (FIFO-EMPTY? 1:FIFO)
298  (FIFO-ENQUEUE 1:ANY 2:FIFO 3:FIFO)
297  (FIFO-ENUMERATION 1:FIFO 2:SERIES)
310  (FILTER 1:SERIES 2:SERIES)
310  (FILTERING 1:ANY 2:ANY)
306  (FIND-OAL-TAIL 1:ORDERED-ASSOCIATIVE-LIST 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST)
306  (FIND-OAL-TAIL-UNSAFE 1:ORDERED-ASSOCIATIVE-LIST 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST)
309  (GENERATE 1:ANY 2:SERIES)
291  (GENERATE-EVENT-QUEUES-AND-NODES 1:PRIORITY-QUEUE 2:SEQUENCE
      3:SERIES 4:SERIES)
294  (GENERATE-GLOBAL-BUFFERS-AND-NODES 1:SEQUENCE 2:QUEUE 3:SERIES
      4:SERIES)
309  (GENERATION 1:ANY 2:ANY)
293  (GLOBAL-AND-LOCAL-BUFFERS-EMPTY? 1:SEQUENCE 2:QUEUE)
297  (GROW-CIS 1:CIRCULAR-INDEXED-SEQUENCE 2:CIRCULAR-INDEXED-SEQUENCE)
299  (HANDLE-MESSAGE 1:MESSAGE 2:SEQUENCE 3:QUEUE 4:SEQUENCE 5:QUEUE)
302  (HASH-DELETE 1:ANY 2:HASH-TABLE 3:HASH-TABLE)
302  (HASH-INSERT 1:ANY 2:ANY 3:HASH-TABLE 4:HASH-TABLE)
302  (HASH-LOOKUP 1:ANY 2:HASH-TABLE 3:ANY)
310  (INCREMENT 1:INTEGER 2:INTEGER)
310  (INCREMENT-OR-DECREMENT 1:INTEGER 2:INTEGER)
301  (INDEXED-SEQUENCE-ACCUMULATION 1:SERIES 2:INDEXED-SEQUENCE
      3:INDEXED-SEQUENCE)
301  (INDEXED-SEQUENCE-EXTRACT 1:INDEXED-SEQUENCE 2:ANY
      3:INDEXED-SEQUENCE)
301  (INDEXED-SEQUENCE-INSERT 1:ANY 2:INDEXED-SEQUENCE
      3:INDEXED-SEQUENCE)
297  (INTERMEDIATE-GROW-CIS 1:CIRCULAR-INDEXED-SEQUENCE 2:INTEGER
      3:CIRCULAR-INDEXED-SEQUENCE)
305  (INTERMEDIATE-UOAL-DELETE 1:ANY 2:UNORDERED-ASSOCIATIVE-LIST
      3:LINKED-LIST 4:UNORDERED-ASSOCIATIVE-LIST)
300  (INTERPRET-INSTRUCTION 1:INSTRUCTION 2:SEQUENCE 3:EXECUTION-CONTEXT
      4:QUEUE 5:SEQUENCE 6:EXECUTION-CONTEXT 7:QUEUE)
298  (ITERATIVE-EVALUATION 1:ANY 2:SEQUENCE 3:EXECUTION-CONTEXT 4:QUEUE
      5:ANY 6:SEQUENCE 7:EXECUTION-CONTEXT 8:QUEUE)
311  (ITERATIVE-SEARCH 1:ANY 2:ANY)
311  (LE 1:LINKED-LIST 2:SERIES)
309  (LIST-EMPTY 1:LINKED-LIST)
309  (LIST-POP 1:LINKED-LIST 2:ANY 3:LINKED-LIST)
307  (LIST-PUSH 1:ANY 2:LINKED-LIST 3:LINKED-LIST)
313  (LIST-TO-SEQUENCE 1:LINKED-LIST 2:INTEGER 3:SEQUENCE)
300  (LOAD-ARGUMENTS 1:MESSAGE 2:NODE 3:NODE)
300  (LOAD-ARGUMENTS-INTO-AN 1:MESSAGE 2:ASYNCH-NODE 3:ASYNCH-NODE)
300  (LOAD-ARGUMENTS-INTO-MEMORY 1:MESSAGE 2:ASSOCIATIVE-SET
      3:ASSOCIATIVE-SET)
300  (LOAD-ARGUMENTS-INTO-SN 1:MESSAGE 2:SYNCH-NODE 3:SYNCH-NODE)
292  (LOCAL-BUFFER-DQ 1:SYNCH-NODE 2:MESSAGE 3:SYNCH-NODE)
292  (LOCAL-BUFFER-EMPTY? 1:SYNCH-NODE)
292  (LOCAL-BUFFER-NONEMPTY? 1:SYNCH-NODE)
292  (LOCAL-BUFFER-NQ 1:MESSAGE 2:SYNCH-NODE 3:SYNCH-NODE)
293  (LOCAL-BUFFERS-ALWAYS-EMPTY? 1:SERIES)
293  (LOCAL-BUFFERS-EMPTY? 1:SEQUENCE)
300  (LOOKUP-AND-EXECUTE-HANDLER 1:MESSAGE 2:SEQUENCE 3:QUEUE 4:INTEGER
      5:SYMBOL 6:SEQUENCE 7:QUEUE)
304  (LOOKUP-DESTINATION 1:SEQUENCE 2:MESSAGE 3:ANY)
299  (LOOKUP-HANDLER 1:SYMBOL 2:HANDLER)
299  (LOOKUP-HANDLER-FOR-MESSAGE 1:MESSAGE 2:HANDLER)
292  (LOOKUP-NODE-AND+UPDATE 1:MESSAGE 2:SEQUENCE 3:SEQUENCE)
313  (MAX 1:INTEGER 2:INTEGER 3:INTEGER)
313  (MIN 1:INTEGER 2:INTEGER 3:INTEGER)
314  (NEGATE-IF-NEGATIVE 1:INTEGER 2:INTEGER)
312  (NEW-SEQUENCE 1:INTEGER 2:SEQUENCE)
312  (NEW-TERM 1:SEQUENCE 2:INTEGER 3:ANY 4:SEQUENCE)
307  (OAL-RETRIEVE-IF-EXISTS 1:ANY 2:ORDERED-ASSOCIATIVE-LIST 3:ANY 4:ANY)
307  (OAL-SPLICE-IN 1:SERIES 2:ANY 3:ORDERED-ASSOCIATIVE-LIST
      4:ORDERED-ASSOCIATIVE-LIST)
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307  (OAL-SPLICE-OUT 1:SERIES 2:ORDERED-ASSOCIATIVE-LIST
      3:ORDERED-ASSOCIATIVE-LIST)
307  (ORDERED-ASSOC-LE 1:ORDERED-ASSOCIATIVE-LIST 2:ANY 3:SERIES)
306  (ORDERED-ASSOC-LIST-DELETE 1:ANY 2:ORDERED-ASSOCIATIVE-LIST
      3:ORDERED-ASSOCIATIVE-LIST)
309  (ORDERED-ASSOC-LIST-EXTRACT 1:ORDERED-ASSOCIATIVE-LIST 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST)
305  (ORDERED-ASSOC-LIST-INSERT 1:ANY 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST
      4:ORDERED-ASSOCIATIVE-LIST)
306  (ORDERED-ASSOC-LIST-INSERT-SAFE 1:ANY 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST
      4:ORDERED-ASSOCIATIVE-LIST)
306  (ORDERED-ASSOC-LIST-INSERT-UNSAFE 1:ANY 2:ANY
      3:ORDERED-ASSOCIATIVE-LIST
      4:ORDERED-ASSOCIATIVE-LIST)
307  (ORDERED-ASSOC-LIST-LOOKUP 1:ANY 2:ORDERED-ASSOCIATIVE-LIST
      3:ANY)
307  (ORDERED-ASSOC-SLE 1:ORDERED-ASSOCIATIVE-LIST 2:ANY
      3:SERIES)
293  (POLL-NODES-AND-DO-WORK 1:SEQUENCE 2:SEQUENCE 3:QUEUE)
305  (PQ-EMPTY 1:PRIORITY-QUEUE)
305  (PQ-ENUMERATION 1:PRIORITY-QUEUE 2:ANY)
305  (PQ-EXTRACT 1:PRIORITY-QUEUE 2:ANY 3:PRIORITY-QUEUE)
305  (PQ-INSERT 1:ANY 2:ANY 3:PRIORITY-QUEUE 4:PRIORITY-QUEUE)
291  (PROCESS-EVENT 1:EVENT 2:PRIORITY-QUEUE 3:SEQUENCE
      4:PRIORITY-QUEUE 5:SEQUENCE)
302  (PROPERTY-LIST-LOOKUP 1:SYMBOL 2:SYMBOL 3:ANY)
295  (QUEUE-EMPTY? 1:QUEUE)
295  (QUEUE-EXTRACT 1:QUEUE 2:ANY 3:QUEUE)
295  (QUEUE-INSERT 1:ANY 2:QUEUE 3:QUEUE)
304  (RECORD-AT-DESTINATION 1:ANY 2:MESSAGE 3:SEQUENCE 4:SEQUENCE)
312  (REVERSE-LIST 1:LINKED-LIST 2:LINKED-LIST)
297  (ROOMY-CIS-ADD 1:ANY 2:CIRCULAR-INDEXED-SEQUENCE
      3:CIRCULAR-INDEXED-SEQUENCE)
299  (RUNNING-STATUS? 1:EXECUTION-CONTEXT)
299  (RUNNING-TEST 1:SYMBOL)
310  (SELECT-TERM 1:SEQUENCE 2:INTEGER 3:ANY)
310  (SELECT-TERM-MAP 1:SEQUENCE 2:SERIES 3:SERIES)
311  (SEQ-LIST-SEARCH 1:LINKED-LIST 2:ANY)
312  (SEQUENCE-ACCUMULATE 1:SERIES 2:SERIES 3:SEQUENCE 4:SEQUENCE)
312  (SEQUENCE-ACCUMULATION 1:ANY 2:INTEGER 3:SEQUENCE 4:SEQUENCE)
313  (SEQUENCE-AND-INDEX-ENUMERATION 1:SEQUENCE 2:SERIES 3:SERIES)
312  (SEQUENCE-ENUMERATION 1:SEQUENCE 2:SERIES)
312  (SEQUENCE-SIZE 1:SEQUENCE 2:INTEGER)
311  (SEQUENTIAL-SEARCH 1:SERIES 2:ANY)
291  (SEQUENTIAL-SIMULATION-OF-MESSAGE-PASSING-SYSTEM
      1:SEQUENCE 2:ANY 3:SEQUENCE)
311  (SLE 1:LINKED-LIST 2:SERIES)
313  (SQUARE-ROOT-OF-SQUARE 1:INTEGER 2:INTEGER)
296  (STACK-EMPTY? 1:STACK)
295  (STACK-ENUMERATION 1:STACK 2:SERIES)
296  (STACK-POP 1:STACK 2:ANY 3:STACK)
296  (STACK-PUSH 1:ANY 2:STACK 3:STACK)
313  (SUM 1:SERIES 2:INTEGER)
313  (SUMMING 1:INTEGER 2:INTEGER 3:INTEGER)
294  (SYNCHRONOUS-SIMULATION 1:SEQUENCE 2:MESSAGE 3:SEQUENCE)
293  (SYNCHRONOUS-SIMULATION-FINISHED? 1:SEQUENCE 2:QUEUE
      3:SEQUENCE)
294  (SYNCHRONOUS-SIMULATION-W-GLOBAL-MESSAGE-BUFFER
      1:SEQUENCE 2:MESSAGE 3:SEQUENCE)
313  (TEST-PREDICATE 1:ANY)
312  (TRAILING-GENERATE 1:ANY 2:SERIES 3:SERIES)
312  (TRAILING-GENERATION 1:ANY 2:ANY 3:ANY)
312  (TRAILING-PTR-LE 1:LINKED-LIST 2:SERIES 3:SERIES)
311  (TRUNCATE 1:SERIES 2:SERIES)
309  (TRUNCATE-EQUAL-PRIORITY 1:SERIES 2:ANY 3:SERIES)
309  (TRUNCATE-EQUAL-PRIORITY-HEAD 1:SERIES 2:ANY 3:SERIES)
309  (TRUNCATE-OAL-POSITION 1:SERIES 2:ANY 3:SERIES)
307  (TRUNCATE-OAL-POSITION-UNSAFE 1:SERIES 2:ANY 3:SERIES)
311  (TRUNCATION 1:ANY 2:ANY)
313  (UNARY-PREDICATE 1:ANY 2:ANY)
305  (UNORDERED-ASSOC-LIST-DELETE
      1:ANY 2:UNORDERED-ASSOCIATIVE-LIST
      3:UNORDERED-ASSOCIATIVE-LIST)
305  (UNORDERED-ASSOC-LIST-EMPTY? 1:UNORDERED-ASSOCIATIVE-LIST)
305  (UNORDERED-ASSOC-LIST-INSERT 1:ANY
      2:UNORDERED-ASSOCIATIVE-LIST
      3:UNORDERED-ASSOCIATIVE-LIST)
305  (UNORDERED-ASSOC-LIST-LOOKUP
      1:ANY 2:UNORDERED-ASSOCIATIVE-LIST 3:ANY)
301  (UPDATE+BUMP 1:ANY 2:INDEXED-SEQUENCE 3:INDEXED-SEQUENCE)
301  (UPDATE+FETCH 1:INDEXED-SEQUENCE 2:ANY 3:INDEXED-SEQUENCE)
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